Interfacing Assembly Language and C
===================================


Register Usage under the ARM Procedure Call Standard
----------------------------------------------------


About this Recipe
.................

In this recipe you will learn about:

 *  the basic issues involved with interfacing ARM Assembly Language code to C 
    programs;

 *  the basic concepts of the ARM Procedure Call Standard (or <APCS>), with 
    more detail on register usage issues.

The supporting example illustrates:

 *  a simple function written in assembler which is linkable with C modules;

 *  some of the issues involved with the APCS.


Introduction to the APCS
........................

The ARM Procedure Call Standard is a set of rules which govern calls between 
functions which are visible between separately compiled or assembled code 
fragments.

The following are defined by the APCS:

 *  constraints on the use of registers;

 *  stack conventions;

 *  the format of a stack backtrace data structure;

 *  argument passing and result return;

 *  support for the ARM shared library mechanism.

Code which is produced by compilers is expected to adhere to the APCS at all 
times.  Such code is said to be <strictly conforming>.

Hand written code is expected to adhere to the APCS when making calls to 
externally visible functions.  Such code is said to be <conforming>.

The ARM Procedure Call Standard comprises a family of variants.  The following 
independent choices need to be made to fix the variant of the APCS required:

 *  Is the Program Counter 32-bit or 26-bit?

 *  Is stack limit checking explicit or implicit? ie. is stack limit checking 
    performed by code, or is it checked by memory management hardware?

 *  Should floating point values be passed in floating point registers?

 *  Is code reentrant or non-reentrant?

Code which conforms to one APCS variant conforms to none of the other variants.

For the full specification of the APCS see "<ARM Procedure Call Standard>" 
starting on page 38 of the Technical Specifications.


Register Names and Usage under the APCS
.......................................

The following table summarises the names and uses allocated to the ARM and 
Floating Point registers under the APCS (note that not all ARM systems support 
floating point):

    Name          Register    APCS Role

    a1            r0          argument 1 / integer result
    a2            r1          argument 2
    a3            r2          argument 3
    a4            r3          argument 4

    v1-v5         r4-r8       register variables

    sb            r9          static base
    sl            r10         stack limit / stack chunk handle
    fp            r11         frame pointer
    ip            r12         new-static base in inter-link-unit calls
    sp            r13         lower end of current stack frame
    lr            r14         link address
    pc            r15         program counter

    f0            f0          FP argument 1 / FP result
    f1            f1          FP argument 2
    f2            f2          FP argument 3
    f3            f3          FP argument 4
    f4-f7         f4-f7       FP register variables

Simplistically:

    a1-a4, f0-f3
                are used to pass arguments to functions.  a1 is also used to 
                return integer results, and f0 to return FP results.  These 
                registers can be corrupted by a called function.

    v1-v5, f4-f7
                are used as register variables.  They must be preserved by 
                called functions.

    sb, sl, fp, ip, sp, lr, pc
                have a dedicated role in some APCS variants, some of the 
                time.  ie. there are times when some of these registers can 
                be used for other purposes even when strictly conforming to the 
                APCS.  In some variants of the APCS sb and sl are available as 
                additional variable registers v6 and v7 respectively.

As stated previously, hand coded assembler routines need not <conform strictly> 
to the APCS, but need only <conform>.  This means that all registers which do 
not need to be used in their APCS role by an assembler routine (eg. fp) can be 
used as working registers as long as their value on entry is restored before 
returning.


64 Bit Integer Addition
.......................

The purpose of this example is to examine coding a small function in ARM 
Assembly Language, in a way which will enable it to be used from C modules.  
First, however, the function is coded in C, and the compiler's output examined.

Let us consider writing a 64 bit integer addition routine in C, where the data 
structure used to store 64 bit integers is a two word structure.  The obvious 
way to code the addition of these double length integers in assembler is to 
make use of the Carry flag from the low word addition in the high word 
addition.  However, there is no way to specify this in C.

A possible way to code around this in C is as follows:

    void add_64(int64 *dest, int64 *src1, int64 *src2)
    { unsigned hibit1=src1->lo >> 31, hibit2=src2->lo >> 31, hibit3;
      dest->lo=src1->lo + src2->lo;
      hibit3=dest->lo >> 31;
      dest->hi=src1->hi + src2->hi +
               ((hibit1 & hibit2) || ((hibit1 | hibit2) && !hibit3));
      return;
    }


Explanation

The highest bits of the low words in the two operands are calculated (shifting 
them into bit 0, while clearing the rest of the register). These are then used 
to determine the value of the carry bit (in the same way as the ARM itself 
does).
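
A common portable alternative, which is not the method used in <add64_1.c>, is 
to detect the carry by comparing the unsigned low-word sum with one of the 
operands: the sum has wrapped round exactly when it is smaller than that 
operand.  The following is a minimal sketch only.  It assumes that int64 holds 
two unsigned 32-bit words, and the name add_64_cmp is hypothetical:

```c
#include <stdint.h>

/* Assumed layout: two unsigned 32-bit words, low word first. */
typedef struct int64_struct
{ uint32_t lo;
  uint32_t hi;
} int64;

/* Hypothetical variant of add_64: the low-word sum has wrapped
   round exactly when it is smaller than one of the operands.   */
void add_64_cmp(int64 *dest, const int64 *src1, const int64 *src2)
{ uint32_t lo = src1->lo + src2->lo;
  uint32_t carry = (lo < src1->lo);       /* 1 if the addition wrapped */
  dest->hi = src1->hi + src2->hi + carry;
  dest->lo = lo;
}
```

Whether a compiler turns the comparison into good ARM code depends on the 
compiler, so this form is offered for checking results rather than for speed.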


Examining the Compiler's Output

If the 64 bit integer addition routine is used a great deal, then a poor 
implementation such as this is likely to be inadequate.  To see just how good 
or bad this implementation is let us look at the actual code which the compiler 
produces.

Set the current directory to <examples>.  The above code can be found in 
<add64_1.c>, which we can compile to ARM Assembly Language source as follows:

    armcc -li -apcs 3/32bit -S add64_1.c

The -S flag tells <armcc> to produce ARM Assembly Language source (suitable for 
<armasm>) rather than producing object code.  The -li flag tells <armcc> to 
compile for a little-endian memory and the -apcs option specifies that the 32 
bit version of APCS 3 should be used.  You can omit these options if your <armcc> 
has been configured for this default (see "<The ARM Tool Reconfiguration 
Utility (reconfig)>" of the User Manual for details).

Looking at the output file, <add64_1.s>, we can see that this is indeed an 
inefficient implementation.


Modifying the Compiler's Output

Let us go back to the original intention of coding the 64 bit integer addition 
using the ARM's Carry flag.  Since use of the Carry flag cannot be specified in 
C, we can get the compiler to produce almost the right code, and then modify it 
by hand.  Let us start with (incorrect) code which does not perform the carry 
addition:

    void add_64(int64 *dest, int64 *src1, int64 *src2)
    { dest->lo=src1->lo + src2->lo;
      dest->hi=src1->hi + src2->hi;
      return;
    }

To compile this to give assembler suitable for use with <armasm> first set the 
current directory to <examples>, and issue this command (the options used are 
described above):

    armcc -li -apcs 3/32bit -S add64_2.c

This will produce the source in <add64_2.s>, which will include the following 
code:

    add_64
        LDR    a4,[a2,#0]
        LDR    ip,[a3,#0]
        ADD    a4,a4,ip
        STR    a4,[a1,#0]
        LDR    a2,[a2,#4]
        LDR    a3,[a3,#4]
        ADD    a2,a2,a3
        STR    a2,[a1,#4]
        MOV    pc,lr

Comparing this carefully with the C source, we can see that the first ADD 
instruction produces the low order word, and the second produces the high order 
word.  All we need to do to get the carry from the low to the high word right is 
change the first ADD to ADDS (add and set flags), and the second ADD to ADC (add 
with carry).  This modified code is available in the <examples> directory as 
<add64_3.s>.


What effect did the APCS have on this example?

Look at the above code again.  The most obvious way in which the APCS has 
affected the code produced is that the registers are all given APCS style 
names, as introduced earlier in this recipe.

a1 clearly holds a pointer to the destination structure, a2 and a3 pointers to 
the operand structures.  Both a4 and ip are used as temporary registers, which 
are not preserved.  The conditions under which ip can be corrupted will be 
discussed later in this recipe.

This is a simple leaf function, which uses few temporary registers.  Therefore 
no registers are saved to the stack, and none need to be restored on exit.  
Thus a simple "MOV pc,lr" can be used to return.

If we had wished to return a result, perhaps the carry out from this addition, 
then it would be loaded into a1 prior to exit.  In this example, this could be 
done by changing the second ADD to ADCS (add with carry and set flags), and 
adding the following instructions to load a1 with 1 or 0 depending on the carry 
out from the high order addition:

        MOV    a1, #0
        ADC    a1, a1, #0
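
The value that such a routine would leave in a1 can be cross-checked on a host 
machine with a C model.  The sketch below is an assumption-laden illustration 
rather than part of the examples: it uses 64-bit host arithmetic to expose the 
carries, represents each 64-bit integer as an array of two 32-bit words (low 
word first), and the name add_64_carry is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of add_64 that also returns the carry out of the
   high order addition (the value the routine would return in a1).     */
unsigned add_64_carry(uint32_t dest[2],
                      const uint32_t src1[2], const uint32_t src2[2])
{ uint64_t lo = (uint64_t)src1[0] + src2[0];               /* models ADDS */
  uint64_t hi = (uint64_t)src1[1] + src2[1] + (lo >> 32);  /* models ADCS */
  dest[0] = (uint32_t)lo;
  dest[1] = (uint32_t)hi;
  return (unsigned)(hi >> 32);     /* models MOV a1,#0 ; ADC a1,a1,#0 */
}
```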


Back to the first inefficient implementation

Although the first C implementation was inefficient, it shows us more about the 
APCS than the more efficient hand modified version.

We have already seen a4 and ip being used as non-preserved temporary registers.  
However, here v1 and lr are also used as temporary registers.  v1 is preserved 
by storing it (together with lr) on entry.  lr is corrupted, but a copy is 
saved, onto the stack, and is reloaded into pc at the same time that v1 is 
restored.

Thus there is still only a single exit instruction, but now it is:

        LDMIA  sp!,{v1,pc}


More Detailed APCS Register Usage Information
.............................................

It was stated initially that sb,sl,fp,ip,sp and lr are dedicated registers, but 
in the example we saw ip and lr being used as temporary registers.  Indeed, 
there are times when these registers are not used for their APCS roles, and it 
is useful to know about these situations, so that efficient (but safe) code can 
be written to make use of as many of the registers as possible and thereby 
avoid doing unnecessary register saving and restoring.

    ip          This register is used only during function call.  It is 
                conventionally used as a local code generation temporary 
                register.  At other times it can be used as a corruptible 
                temporary register.  

    lr          This register holds the address to which control must return on 
                function exit.  It can be (and often is) used as a temporary 
                register after pushing its contents onto the stack.  This value 
                can then be reloaded straight into the PC, as was the case in 
                "<Back to the first inefficient implementation>" starting on 
                page 65.

    sp          This is the stack pointer, which is always valid in <strictly 
                conforming> code, but need only be preserved in hand written 
                code.  Note, however, that if any use of the stack is to be 
                made by hand written code, sp must be available. 

    sl          This is the stack limit register.  If stack limit checking is 
                explicit (ie. it is performed by code when stack pushes occur, 
                rather than by memory management hardware causing a trap when 
                stack overflow occurs), then sl must be valid whenever sp is 
                valid.  If stack checking is implicit sl is instead treated as 
                v7, an additional register variable (which must be preserved by 
                called functions).

    fp          This is the frame pointer register.  It contains either zero, 
                or a pointer to the most recently created stack backtrace data 
                structure.  As with the stack pointer, this must be preserved, 
                but in hand written code need not be available at all instants.  
                It should, however, be valid whenever any <strictly conforming> 
                functions are called.  For more information refer to 
                "<Function Invocations and Backtrace Structures>" of the 
                Technical Specifications.

    sb          This is the static base register.  If the variant of the APCS 
                being used is reentrant, then this register is used to access 
                an array of static data pointers to allow code to access data 
                reentrantly.  For more information see "<Reentrant vs 
                Non-Reentrant Code>" of the Technical Specifications.  
                However, if the variant of the APCS being used is not 
                reentrant then sb is instead available as an additional 
                register variable, v6 (which must be preserved by called 
                functions).

Thus sp,sl,fp and sb must all be preserved on function exit for APCS <conforming> 
code.


Related Topics
..............

 *  "<Passing and Returning structs>";

 *  "<In-Line SWIs>".


Passing and Returning structs
-----------------------------


About this Recipe
.................

In this recipe you will learn about:

 *  the way structs are normally passed to and from functions;

 *  cases when this is automatically optimised;

 *  how to tell the compiler to return a struct value using several registers.


The Default Way to Pass and Return a struct
...........................................

Unless special conditions apply (detailed in following sections), C structures 
are:

 *  passed in registers which if necessary overflow onto the stack;

 *  returned via a pointer to the memory location of the result.

For struct-valued functions a pointer to the location where the struct result 
is to be placed is passed in a1 (the first argument register).  The first 
argument is then passed in a2, the second in a3 etc.

It is as if:

    struct s f(int x)

were compiled as:

    void f(struct s *result, int x)

As a demonstration of the default way in which structures are passed and 
returned consider the following code:

    typedef struct two_ch_struct
    { char ch1;
      char ch2;
    } two_ch;
    
    two_ch max( two_ch a, two_ch b )
    { return (a.ch1>b.ch1) ? a : b;
    }

This code is available in the <examples> directory as <two_ch.c>.  It can be 
compiled to produce Assembly Language source by using the following command:

    armcc -S two_ch.c -li -apcs 3/32bit

Where -li and -apcs 3/32bit can be omitted if <armcc> has been configured 
appropriately already - see "<The ARM Tool Reconfiguration Utility (reconfig)>" 
starting on page 45 of the User Manual for more details.

Here is the code which <armcc> produces:

    max
        MOV    ip,sp
        STMDB  sp!,{a1-a3,fp,ip,lr,pc}
        SUB    fp,ip,#4
        LDRB   a3,[fp,#-&14]
        LDRB   a2,[fp,#-&10]
        CMP    a3,a2
        SUBLE  a2,fp,#&10
        SUBGT  a2,fp,#&14
        LDR    a2,[a2,#0]
        STR    a2,[a1,#0]
        LDMDB  fp,{fp,sp,pc}

The STMDB instruction saves the arguments onto the stack, together with the 
frame pointer, stack pointer, link register and current pc value (this sequence 
of values is the stack backtrace data structure).
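
The offsets used by the LDRB and SUB instructions can be checked against a C 
model of the words stored by the STMDB.  This is a sketch under stated 
assumptions: words are 32 bits, STMDB places the lowest numbered register at 
the lowest address, and the names max_frame and BELOW_FP are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Words written by STMDB sp!,{a1-a3,fp,ip,lr,pc}, lowest address first. */
typedef struct max_frame
{ uint32_t a1;        /* pointer to the result struct            */
  uint32_t a2;        /* argument a                              */
  uint32_t a3;        /* argument b                              */
  uint32_t saved_fp;  /* caller's frame pointer                  */
  uint32_t saved_sp;  /* ip, which holds a copy of sp on entry   */
  uint32_t saved_lr;  /* return address                          */
  uint32_t saved_pc;  /* fp points here after SUB fp,ip,#4       */
} max_frame;

/* Distance in bytes below fp of a member of the frame. */
#define BELOW_FP(member) \
  (offsetof(max_frame, saved_pc) - offsetof(max_frame, member))
```

With this layout BELOW_FP(a2) is &14 and BELOW_FP(a3) is &10, matching the two 
LDRB instructions, and the three words immediately below fp are the ones 
reloaded by the final LDMDB.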

a2 and a3 are then used as temporary registers to hold the required part of 
the structures passed, and a1 as a pointer to an area in memory in which the 
resulting struct is placed - all as expected.

For a basic explanation of register naming and usage under the APCS, see 
"<Register Usage under the ARM Procedure Call Standard>".  Detailed information 
can be found in "<C Language Calling Conventions>" starting on page 47 of the 
Technical Specifications.


The Optimisation of Integer-like Structures
...........................................

The ARM Procedure Call Standard specifies different rules for returning 
<integer-like> structs.  An integer-like struct is one which has the following 
properties:

 *  The size of the struct is no larger than one word;

 *  The byte offset of each addressable sub-field is 0 (bit-fields are not 
    addressable).

Thus the following structs are integer-like:

    struct
    { unsigned a:8, b:8, c:8, d:8;
    }
    
    union polymorphic_ptr
    { struct A *a;
      struct B *b;
      int      *i;
    }

Whereas the structure used in the previous example is not integer-like:

    struct { char ch1, ch2; }
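
The second rule can be checked mechanically with offsetof.  The sketch below 
is illustrative only: the tag two_ch and the OFF_ names are hypothetical, and 
the size rule is not tested because a host machine's pointers may be wider 
than an ARM word.  Bit-fields cannot be given to offsetof, which mirrors the 
fact that they are not addressable.

```c
#include <stddef.h>

struct A; struct B;               /* incomplete types, only pointed to */

union polymorphic_ptr
{ struct A *a;
  struct B *b;
  int      *i;
};

struct two_ch { char ch1, ch2; };

/* Byte offsets of the addressable sub-fields. */
enum
{ OFF_PTR_B = offsetof(union polymorphic_ptr, b),   /* 0 */
  OFF_PTR_I = offsetof(union polymorphic_ptr, i),   /* 0 */
  OFF_CH2   = offsetof(struct two_ch, ch2)          /* 1: not integer-like */
};
```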

Integer-like structs are returned by returning the struct's contents in a1 
rather than a pointer to the struct's contents.  Thus a1 is not needed to pass 
a pointer to a result struct in memory, and can instead be used to pass the 
first argument.

For example, consider the following code:

    typedef struct half_words_struct
    { unsigned field1:16;
      unsigned field2:16;
    } half_words;
    
    half_words max( half_words a, half_words b )
    { half_words x;
      x= (a.field1>b.field1) ? a : b;
      return x;
    }

We would expect arguments a and b to be passed in registers a1 and a2, and 
since half_words_struct is integer-like we expect the result structure to be 
passed back directly in a1 (rather than a1 being used to return a pointer to 
the result half_words_struct).

The above code is available in the <examples> directory as <half_str.c>.  It 
can be compiled to produce Assembly Language source by using the following 
command:

    armcc -S half_str.c -li -apcs 3/32bit

Where -li and -apcs 3/32bit can be omitted if <armcc> has been configured 
appropriately already - see "<The ARM Tool Reconfiguration Utility (reconfig)>" 
starting on page 45 of the User Manual for more details.

Here is the code which <armcc> produces:

    max
        MOV    a3,a1,LSL #16
        MOV    a3,a3,LSR #16
        MOV    a4,a2,LSL #16
        MOV    a4,a4,LSR #16
        CMP    a3,a4
        MOVLE  a1,a2
        MOV    pc,lr

Clearly the contents of the <half_words> structure are returned directly in a1 
as expected.
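
The generated code treats each struct as a single 32-bit word.  A C model of 
that view is sketched below.  It assumes, as the shifts in the listing suggest, 
that field1 occupies bits 0-15 and field2 bits 16-31; the name max_packed is 
hypothetical.

```c
#include <stdint.h>

/* Assumed packing: field1 in bits 0-15, field2 in bits 16-31. */
uint32_t max_packed(uint32_t a, uint32_t b)
{ /* MOV a3,a1,LSL #16 ; MOV a3,a3,LSR #16 */
  uint32_t a_field1 = (uint32_t)(a << 16) >> 16;
  /* MOV a4,a2,LSL #16 ; MOV a4,a4,LSR #16 */
  uint32_t b_field1 = (uint32_t)(b << 16) >> 16;
  /* CMP a3,a4 ; MOVLE a1,a2 */
  return (a_field1 > b_field1) ? a : b;
}
```

When the two field1 values are equal the model returns b, just as MOVLE 
replaces a1 with a2 in that case.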


Returning Non Integer-Like structs in Registers
...............................................

There are occasions when a function needs to return more than one value.  The 
normal way to achieve this is to define a structure which holds all the values 
to be returned, and return this.

As we have seen, this will result in a pointer to the structure being passed in 
a1, which will then be dereferenced to store the values returned.

For some applications in which such a function is time critical, the overhead 
involved in "wrapping" and then "unwrapping" this structure can be significant.  
However, there is a way to tell the compiler that a structure should be 
returned in the argument registers a1 - a4.  Clearly this is only useful for 
returning structures which are no larger than 4 words.

The way to tell the compiler to return a structure in the argument registers is 
to use the keyword "__value_in_regs".


Multiplication - Returning a 64-bit Result

To illustrate how to use __value_in_regs, let us consider writing a function 
which multiplies two 32-bit integers together and returns the 64-bit result.

The way this function must work is to split the two 32-bit numbers (a, b) into 
high and low 16-bit parts (a_hi, a_lo, b_hi, b_lo).  The four multiplications 
a_lo * b_lo, a_hi * b_lo, a_lo * b_hi, a_hi * b_hi must be performed, and the 
results added together, taking care to deal with carry correctly.

Since the problem involves dealing with carry correctly, coding this function 
in C will not produce optimal code (see "<64 Bit Integer Addition>" starting on 
page 63 for more details).  Therefore we will want to code the function in ARM 
Assembly Language.  The following code performs the algorithm just described:

    ; On entry a1 and a2 contain the 32-bit integers to be multiplied (a, b)
    ; On exit a1 and a2 contain the result (a1 bits 0-31, a2 bits 32-63) 
    mul64
        MOV    ip, a1, LSR #16        ; ip = a_hi
        MOV    a4, a2, LSR #16        ; a4 = b_hi
        BIC    a1, a1, ip, LSL #16    ; a1 = a_lo
        BIC    a2, a2, a4, LSL #16    ; a2 = b_lo
        MUL    a3, a1, a2             ; a3 = a_lo * b_lo        (m_lo)
        MUL    a2, ip, a2             ; a2 = a_hi * b_lo        (m_mid1)
        MUL    a1, a4, a1             ; a1 = a_lo * b_hi        (m_mid2)
        MUL    a4, ip, a4             ; a4 = a_hi * b_hi        (m_hi)
        ADDS   ip, a2, a1             ; ip = m_mid1 + m_mid2    (m_mid)
        ADDCS  a4, a4, #&10000        ; a4 = m_hi + carry       (m_hi')
        ADDS   a1, a3, ip, LSL #16    ; a1 = m_lo + (m_mid<<16)
        ADC    a2, a4, ip, LSR #16    ; a2 = m_hi' + (m_mid>>16) + carry
        MOV    pc, lr
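
For checking results on a host machine, the same algorithm can be written as a 
C model.  This is a sketch, not one of the files in <examples>: it assumes 
32-bit unsigned arithmetic via uint32_t, the name mul64_model is hypothetical, 
and the variable names follow the comments in the assembler source.

```c
#include <stdint.h>

typedef struct int64_struct
{ uint32_t lo;
  uint32_t hi;
} int64;

/* Hypothetical C model of mul64, one statement per assembler step. */
int64 mul64_model(uint32_t a, uint32_t b)
{ uint32_t a_hi = a >> 16, a_lo = a & 0xFFFFu;
  uint32_t b_hi = b >> 16, b_lo = b & 0xFFFFu;
  uint32_t m_lo   = a_lo * b_lo;
  uint32_t m_mid1 = a_hi * b_lo;
  uint32_t m_mid2 = a_lo * b_hi;
  uint32_t m_hi   = a_hi * b_hi;
  uint32_t m_mid  = m_mid1 + m_mid2;       /* ADDS: the sum may wrap      */
  int64 r;
  if (m_mid < m_mid1)                      /* ADDCS: a carry here is      */
    m_hi += 0x10000u;                      /* worth 2^16 in the high word */
  r.lo = m_lo + (m_mid << 16);             /* ADDS                        */
  r.hi = m_hi + (m_mid >> 16) + (r.lo < m_lo);   /* ADC                   */
  return r;
}
```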

Clearly this code is fine for use with Assembly Language modules, but in order 
to use it from C we need to be able to tell the compiler that this routine returns 
its 64-bit result in registers.  This can be done by making the following 
declarations in a header file:

    typedef struct int64_struct
    { unsigned int lo;
      unsigned int hi;
    } int64;
    
    __value_in_regs extern int64 mul64(unsigned a, unsigned b);

The Assembly Language code above, and the declarations above together with a 
test program are all in the <examples> directory, as the files: <mul64.s>, 
<mul64.h>, <int64.h> and <multest.c>.  To compile, assemble and link these to 
produce an executable image suitable for <armsd> first set your current 
directory to <examples>, and then execute the following commands:

    armasm mul64.s -o mul64.o -li
    armcc -c multest.c -li -apcs 3/32bit
    armlink mul64.o multest.o <somewhere>/armlib.32l -o multest

Where <somewhere> is the directory in which the semi-hosted C libraries reside 
(eg. the <lib> directory of the ARM Software Tools Release).  Note also that 
<-li> and <-apcs 3/32bit> can be omitted if <armcc> and <armasm> (and <armsd> 
below) have been configured appropriately - see "<The ARM Tool Reconfiguration 
Utility (reconfig)>" of the User Manual for more details.

<multest> can then be run under <armsd> as follows:

    > armsd -li multest
    A.R.M. Source-level Debugger, version 4.10 (A.R.M.) [Aug 26 1992]
    ARMulator V1.20, 512 Kb RAM, MMU present, Demon 1.01, FPE, Little endian.
    Object program file multest
    armsd: go
    Enter two unsigned 32-bit numbers in hex eg.(100 FF43D)
    12345678 10000001
    Least significant word of result is 92345678
    Most  significant word of result is  1234567
    Program terminated normally at PC = 0x00008418
          0x00008418: 0xef000011 .... : >  swi     0x11
    armsd: quit
    Quitting
    >

To convince yourself that __value_in_regs is being used, try removing it from 
<mul64.h>, recompiling <multest.c>, relinking <multest>, and rerunning it under 
<armsd>.  This time the answers returned will be incorrect, as the result is no 
longer expected to be returned in registers, but instead in a block of memory 
(ie. the code now has a bug).


Related Topics
..............

 *  "<Register Usage under the ARM Procedure Call Standard>";

 *  "<ARM6 Multiplier Performance Issues>".


In-Line SWIs
------------


About This Recipe
.................

This recipe shows how the ARM C Compiler can be used to generate in-line SWIs 
directly from C.


Introduction
............

The ARM instruction set provides the Software Interrupt (SWI) instruction to 
call Operating System routines.  It is useful to be able to generate such 
operating system calls from C without having to call hand crafted ARM Assembly 
Language to provide an interface between C and the SWI.

The ARM C Compiler provides a mechanism which allows many SWIs to be called 
efficiently from C.  SWIs which conform to the following rules can be compiled 
in-line, without additional calling overhead:

 *  The arguments to the SWI (if any) must be passed in r0-r3 only.

 *  The results returned from the SWI (if any) must be returned in r0-r3 only.

The following sections demonstrate how to use the in-line SWI facility of <armcc> 
for a variety of different SWIs which conform to these rules.  These SWIs are 
taken from the ARM Debug Monitor interface, which is described in "<Standard 
Monitor SWIs>" of the Technical Specifications.

In the examples below, the following options are used with <armcc>:

    -li                 This specifies that the target is a little endian 
                        ARM.

    -apcs 3/32bit       This specifies that the 32 bit variant of APCS 3 should 
                        be used.


Using a SWI which returns no result
...................................

Consider SWI_WriteC, which we want to be SWI number 0.

This SWI is intended to write a byte to the debugging channel.  The byte to be 
written is passed in r0.

The following C code, intended to write a Carriage Return / Line Feed sequence 
to the debugging channel, can be found in the <examples> directory as 
<newline.c>:

    void __swi(0) SWI_WriteC(int ch);
    
    void output_newline(void)
    { SWI_WriteC(13);
      SWI_WriteC(10);
    }

Look carefully at the declaration of SWI_WriteC.  __swi(0) is the way in which 
the SWI_WriteC 'function' is declared to be in-line SWI number 0.

This code can be compiled to produce ARM Assembly Language source using:

    armcc -S -li -apcs 3/32bit newline.c -o newline.s

The code produced for the output_newline function is:

    output_newline
        MOV    a1,#&d
        SWI    &0
        MOV    a1,#&a
        SWI    &0
        MOV    pc,lr


Using a SWI which returns one result
....................................

Consider SWI_ReadC, which we want to be SWI number 4.

This SWI is intended to read a byte from the debug channel, returning it in r0.

The following C code, a naive read-a-line routine, can be found in the <examples> 
directory as <readline.c>:

    char __swi(4) SWI_ReadC(void);
    
    void readline(char *buffer)
    { char ch;
      do {
        *buffer++=ch=SWI_ReadC();
      } while (ch!=13);
      *buffer=0;
    }

Again, the way in which SWI_ReadC is declared should be noted: it is a function 
which takes no arguments and returns a char, and is implemented as in-line SWI 
number 4. 

This code can be compiled to produce ARM Assembler source using:

    armcc -S -li -apcs 3/32bit readline.c -o readline.s

The code produced for the readline function is:

    readline
        STMDB  sp!,{lr}
        MOV    lr,a1
    |L000008.J4.readline|
        SWI    &4
        STRB   a1,[lr],#1
        CMP    a1,#&d
        BNE    |L000008.J4.readline|
        MOV    a1,#0
        STRB   a1,[lr,#0]
        LDMIA  sp!,{pc}
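
The behaviour of the loop can be explored on a host machine by replacing the 
in-line SWI with an ordinary function.  Everything about the stand-in below is 
an assumption made for the sketch (the names fake_input and set_fake_input, and 
the choice to supply a Carriage Return when the input runs out); only the body 
of readline is taken from <readline.c>.

```c
/* Host-side stand-in for the in-line SWI: input comes from a string.
   On the target this would be:  char __swi(4) SWI_ReadC(void);       */
static const char *fake_input = "";

static char SWI_ReadC(void)
{ return *fake_input ? *fake_input++ : 13;   /* CR when input runs out */
}

void set_fake_input(const char *s)
{ fake_input = s;
}

/* Same loop as in readline.c. */
void readline(char *buffer)
{ char ch;
  do {
    *buffer++=ch=SWI_ReadC();
  } while (ch!=13);
  *buffer=0;
}
```

Running the model shows two properties of the naive routine: the terminating 
Carriage Return is stored in the buffer, and nothing limits the number of 
characters written, so the caller must supply a large enough buffer.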


Using a SWI which returns 2-4 results
.....................................

If a SWI returns two, three or four results then its declaration must specify 
that it is a struct-valued SWI, and the special keyword __value_in_regs must 
also be used.  This is because a struct valued function is usually treated much 
as if it were a void function with a pointer to where to return the struct as 
the first argument.  See "<Passing and Returning structs>" for more details.

As an example consider SWI_InstallHandler, which we want to be SWI number 0x70.

On entry r0 contains the exception number, r1 contains the workspace pointer, 
r2 contains the address of the handler.

On exit r0 is undefined, r2 contains the address of the previous handler and r1 
the previous handler's workspace pointer.

The following C code fragment demonstrates how this SWI could be declared and 
used in C:

    typedef struct SWI_InstallHandler_struct
    { unsigned exception;
      unsigned workspace;
      unsigned handler;
    } SWI_InstallHandler_block;
    
    
    SWI_InstallHandler_block 
      __value_in_regs  
        __swi(0x70) SWI_InstallHandler(unsigned r0, unsigned r1, unsigned r2);
    
    void InstallHandler(SWI_InstallHandler_block *regs_in,
                        SWI_InstallHandler_block *regs_out)
    { *regs_out=SWI_InstallHandler(regs_in->exception,
                                   regs_in->workspace,
                                   regs_in->handler);
    }

This code is provided in the <examples> directory as <installh.c>, and can be 
compiled to produce ARM Assembler source using:

    armcc -S -li -apcs 3/32bit installh.c -o installh.s 

The code which <armcc> produces is:

    InstallHandler
        STMDB  sp!,{lr}
        MOV    lr,a2
        LDMIA  a1,{a1-a3}
        SWI    &70
        STMIA  lr,{a1-a3}
        LDMIA  sp!,{pc}


The SWI Number is not Known Until Run Time
..........................................

If a SWI is to be called, but the number of the SWI is not known until run 
time, then the mechanisms discussed above are not appropriate.

This situation might occur when there are a number of related operations which 
can be performed on an object, and these various operations are implemented by 
SWIs with different numbers.

There are several ways to deal with this, including:

 *  The SWI instruction can be constructed from the SWI Number, stored 
    somewhere and then executed.

 *  A 'generic' SWI can be used which takes as an extra argument a code for the 
    actual operation to be performed on its arguments.  This 'generic' SWI must 
    then decode the operation and then perform it.
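
For the first method, the instruction word has to be built from the SWI number.  
In the ARM instruction set a SWI has a condition code in bits 28-31, ones in 
bits 24-27 and a 24-bit number in bits 0-23, so an unconditional SWI is 
&EF000000 plus the number; this agrees with the word 0xef000011 shown for 
"swi 0x11" in the <armsd> session in "<Passing and Returning structs>".  The 
sketch below covers the encoding step only.  The name make_swi is hypothetical, 
and storing and executing the word is not shown.

```c
#include <stdint.h>

/* Build the instruction word for an unconditional "SWI number". */
uint32_t make_swi(uint32_t number)
{ return 0xEF000000u | (number & 0x00FFFFFFu);
}
```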

A mechanism has been added to <armcc> to support the second method outlined 
here.  The operation is specified by a value which is passed in r12 (ip).  The 
arguments to the 'generic' SWI are as usual passed in registers r0-r3, and 
values may optionally be returned in r0-r3 using the mechanisms described 
above.  The operation number passed in r12 may well be the number of the SWI to 
be called by the 'generic' SWI, but it need not be.

Here is a C fragment which uses a 'generic', or 'indirect' SWI:

    unsigned __swi_indirect(0x80)
        SWI_ManipulateObject(unsigned operationNumber, unsigned object,
                             unsigned parameter);
    
    unsigned DoSelectedManipulation(unsigned object, unsigned parameter,
                                    unsigned operation)
    { return SWI_ManipulateObject(operation, object, parameter);
    }

This code is provided in the <examples> directory as <swimanip.c>, and can be 
compiled to produce ARM Assembler source using:

    armcc -S -li -apcs 3/32bit swimanip.c -o swimanip.s 

The code which <armcc> produces is:

    DoSelectedManipulation
        MOV    ip,a3
        SWI    &80
        MOV    pc,lr
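
On the other side of the call, the 'generic' SWI has to decode the operation 
number.  The following host-side model sketches that decoding step in C.  It 
is purely illustrative: the operation codes, the function name and the use of 
a pointer to stand for the object are all assumptions, and a real handler would 
receive the operation in r12 and its arguments in r0-r3.

```c
/* Hypothetical operation codes, chosen only for this sketch. */
enum { OP_GET = 0, OP_SET = 1, OP_ADD = 2 };

/* Host-side model of the handler for the 'generic' SWI: 'operation'
   stands for the value received in r12 (ip), 'object' and 'parameter'
   for the arguments.  The return value models r0 on exit.            */
unsigned manipulate_object(unsigned operation, unsigned *object,
                           unsigned parameter)
{ switch (operation)
  { case OP_GET: return *object;
    case OP_SET: *object = parameter;  return *object;
    case OP_ADD: *object += parameter; return *object;
    default:     return 0;          /* unknown operation: do nothing */
  }
}
```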


Related Topics
..............

 *  "<Register Usage under the ARM Procedure Call Standard>";

 *  "<Passing and Returning structs>";

 *  "<C Programming for Deeply Embedded Applications>" for example programs 
    which make use of in-line SWIs.

 

 

