ARM Instruction Set and Processor Features
==========================================

The ARM instruction set has the following key features, some of which are 
common to many other processors, and some of which are not:

 *  Load/Store architecture (only load and store instructions access memory).

 *  32 bit instructions, 32/8 bit data words/bytes.

 *  32 bit addresses (26 bit on earlier ARMs).

 *  15 general purpose 32 bit registers, program counter and program status 
    register - a subset of these are banked, to give rapid context switching 
    for interrupt and supervisor modes. (See the appropriate ARM Data Sheet for 
    details of particular processors).

 *  Flexible store multiple and load multiple instructions allow any set of 
    registers from a single bank to be transferred to/from memory by a single 
    instruction.

 *  There is no single instruction to move an immediate 32 bit value to a 
    register (in general, a literal value has to be loaded from memory).  
    However, a large set of common 32-bit values <can> be generated in a single 
    instruction.

 *  All instructions are executed conditionally on the state of the current 
    program status register.  Only data processing operations with the S bit 
    set change the state of the current program status register.

 *  The second argument to all data-processing and single data-transfer 
    operations can be shifted in quite a general way before the operation is 
    performed. This supports, but is not limited to, scaled addressing, 
    multiplication by a small constant, and construction of constants, within a 
    single instruction.

 *  Co-processor instructions support a general way to extend the ARM's 
    architecture in a customer-specific manner.

In addition, the ARM processor has:

 *  Support for Big- or Little-Endian memory.

 *  A powerful barrel shifter to support ARM's within-instruction shifts.

The recipes in this chapter discuss some of these features in greater detail.


Making the Most of Conditional Execution
----------------------------------------


About this Recipe
.................

In this recipe you learn how conditional execution can eliminate branch 
instructions, producing smaller and faster code. Euclid's Greatest Common 
Divisor algorithm is used for illustrative purposes. Specifically, you will 
learn how to use:

 *  conditional execution;

 *  the 'S' bit in ARM data processing instructions.


The ARM's ALU Status Flags
..........................

The ARM's Program Status Register contains, among other flags, copies of the 
ALU status flags:

    N             Negative result from ALU flag
    Z             Zero result from ALU flag
    C             ALU operation Carried out
    V             ALU operation oVerflowed


Execution Conditions
....................

Every ARM instruction has a 4 bit field which encodes the conditions under 
which it will be executed.  These conditions refer to the state of the ALU N, 
Z, C and V flags as follows:

    EQ            Z set (equal)
    NE            Z clear (not equal)
    CS/HS         C set (unsigned >=)
    CC/LO         C clear (unsigned <)
    MI            N set (negative)
    PL            N clear (positive or zero)
    VS            V set (overflow)
    VC            V clear (no overflow)
    HI            C set and Z clear (unsigned >)
    LS            C clear or Z set (unsigned <=)
    GE            N and V the same (signed >=)
    LT            N and V differ (signed <)
    GT            Z clear, N and V the same (signed >)
    LE            Z set, or N and V differ (signed <=)
    AL            Always execute (the default if none is specified)
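
The table above can be read as a set of tests on the four flags.  The following 
C fragment is a minimal sketch of those tests; it is not part of the <examples> 
directory, and the names Flags, Cond and cond_passes are invented here purely 
for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the four ALU flags held in the PSR. */
typedef struct { bool n, z, c, v; } Flags;

/* Condition names in the order of the table (CS is also HS, CC is also LO). */
typedef enum { EQ, NE, CS, CC, MI, PL, VS, VC, HI, LS, GE, LT, GT, LE, AL } Cond;

/* Returns true if an instruction carrying condition 'cond' would execute. */
bool cond_passes(Cond cond, Flags f)
{
    assert((int)cond <= (int)AL);
    switch (cond) {
    case EQ: return f.z;
    case NE: return !f.z;
    case CS: return f.c;
    case CC: return !f.c;
    case MI: return f.n;
    case PL: return !f.n;
    case VS: return f.v;
    case VC: return !f.v;
    case HI: return f.c && !f.z;            /* unsigned >  */
    case LS: return !f.c || f.z;            /* unsigned <= */
    case GE: return f.n == f.v;             /* signed >=   */
    case LT: return f.n != f.v;             /* signed <    */
    case GT: return !f.z && (f.n == f.v);   /* signed >    */
    case LE: return f.z || (f.n != f.v);    /* signed <=   */
    case AL: return true;
    }
    return false;
}
```

Each pair of conditions (EQ/NE, HI/LS, GT/LE and so on) is an exact complement, 
so for any flag state precisely one of the pair passes.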


Setting the ALU Flags in the PSR
................................

Data processing instructions change the state of the ALU's N, Z, C and V status 
outputs, but these are latched in the PSR's ALU flags only if a special bit (the 
'S' bit) is set in the instruction.
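
As a rough illustration, the effect of the 'S' bit on a subtraction can be 
modelled in C as below.  This is only a sketch: the names Flags and sub_model 
are hypothetical, and the set_flags parameter stands in for the 'S' bit (a CMP 
behaves like a subtraction with the 'S' bit set whose result is discarded).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model of the four ALU flags held in the PSR. */
typedef struct { bool n, z, c, v; } Flags;

/* Computes a - b.  The flags in *psr are updated only when set_flags
   (standing in for the 'S' bit) is true; otherwise they are left alone. */
uint32_t sub_model(uint32_t a, uint32_t b, bool set_flags, Flags *psr)
{
    uint32_t result = a - b;

    assert(psr != NULL);
    if (set_flags) {
        psr->n = (result >> 31) != 0;                    /* top bit of result */
        psr->z = (result == 0);
        psr->c = (a >= b);                               /* set if no borrow  */
        psr->v = (((a ^ b) & (a ^ result)) >> 31) != 0;  /* signed overflow   */
    }
    return result;
}
```

Calling sub_model with set_flags false leaves *psr exactly as it was, which is 
the property that lets a later conditional instruction test flags set by an 
earlier comparison.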


Illustration - Euclid's GCD Algorithm
.....................................

The following code fragment is extracted from <gcd.c>, which can be found in 
the <examples> directory.

    while (a != b)
    { if (a > b) a -= b;
      else       b -= a;
    }

Without conditional execution this could be naively coded as:

    gcd CMP    a1, a2
        BEQ    end
        BLT    lessthan
        SUB    a1, a1, a2
        B      gcd
    lessthan
        SUB    a2, a2, a1
        B      gcd
    end 

Conditional execution and selective setting of the PSR's ALU flags allow it to 
be coded much more compactly as follows (this version can be found in the 
<examples> directory as <gcd.s>):

    gcd CMP    a1, a2
        SUBGT  a1, a1, a2
        SUBLT  a2, a2, a1
        BNE    gcd

Two 'tricks' are illustrated:

 *  The CMP instruction (implicitly) has the 'S' bit set, so the result of the 
    comparison sets the PSR ALU status flags.  However, the following two 
    subtractions do not have the 'S' bit set, so they do not affect the PSR ALU 
    status flags, which remain in the state set by the earlier CMP instruction 
    when the BNE instruction is executed.  The test (a != b) has been combined 
    with the branch back to the top of the loop, giving shorter code, and in 
    many instances code which runs more quickly.

 *  The two subtractions are executed only if the condition specified is met, 
    so two branches around these instructions can be avoided.  In addition to 
    the obvious benefit of smaller code, any pipeline refill caused by the 
    branches will also have been avoided.
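
For reference, the C fragment can be wrapped into a complete function as 
sketched below.  <gcd.c> itself is not reproduced here, so the signature, the 
int argument type and the assertion are assumptions made for illustration.  The 
comments show which instruction of the compact assembler version corresponds to 
each part of the loop.

```c
#include <assert.h>

/* Subtractive form of Euclid's algorithm.  Both arguments must be
   positive, otherwise the loop below never terminates. */
int gcd(int a, int b)
{
    assert(a > 0 && b > 0);
    while (a != b) {            /* CMP a1, a2 ... BNE gcd */
        if (a > b) a -= b;      /* SUBGT a1, a1, a2       */
        else       b -= a;      /* SUBLT a2, a2, a1       */
    }
    return a;
}
```

For example, gcd(12, 18) steps through (12, 6) and (6, 6) before returning 6.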


Running the C Example
.....................

You can run the C <gcd> routine shown above under <armsd>.  To do this first 
set your current directory to the <examples> directory.

Compile, link and run the C version of the <gcd> routine by using the following 
commands:

    armcc -c gcd.c -li -apcs 3/32bit
    armcc -c gcdtest.c -li -apcs 3/32bit
    armlink -o gcdtest gcd.o gcdtest.o <somewhere>/armlib.32l
    armsd -li gcdtest

where <somewhere> is the directory in which <armlib.32l> can be found.


Explanation

The two <armcc> commands compile the <gcd> function and the test harness, 
creating relocatable object files <gcd.o> and <gcdtest.o>.  The -li flag tells 
<armcc> to compile for a little-endian memory.  The -apcs 3/32bit option tells 
<armcc> to use a 32 bit version of the ARM Procedure Call Standard.  You can 
omit these options if your <armcc> has been configured for this default (see 
"<The ARM Tool Reconfiguration Utility (reconfig)>" of the User Manual for how 
to configure the ARM software tools).

The <armlink> command links your relocatable objects with the ARM C library to 
create a runnable program (here called <gcdtest>).

The <armsd> command invokes the debugger, with <gcdtest> as the program to be 
run.  Again -li specifies that little-endian memory is required (as with <armcc> 
above).  For more details on running programs under <armsd> see "<The ARM 
Symbolic Debugger (armsd)>" and "<armsd Command Language>" of the User Manual.


Running the Assembler Example
.............................

You can run the <gcd> routine shown above under <armsd>.  To do this first set 
your current directory to the <examples> directory.

You can assemble, link and run the assembler <gcd> routine by using the 
following commands:

    armasm gcd.s -o gcd.o -li
    armcc -c gcdtest.c -li -apcs 3/32bit
    armlink -o gcdtest gcd.o gcdtest.o <somewhere>/armlib.32l
    armsd -li gcdtest

where <somewhere> is the directory in which <armlib.32l> can be found.


Explanation

The <armasm> command assembles the <gcd> function, creating the relocatable 
object file <gcd.o>.  The -li flag tells <armasm> to assemble for a 
little-endian memory.  You can omit this option if your <armasm> has been 
configured for this default (see "<The ARM Tool Reconfiguration Utility 
(reconfig)>" of the User Manual for how to configure the ARM development 
tools).

The <armcc> command compiles the test harness.  The -c flag tells <armcc> not 
to link its output with the C library; the -li flag tells <armcc> to compile 
for a little-endian memory (as with <armasm>); and the -apcs 3/32bit option 
tells <armcc> to use a 32 bit version of the ARM Procedure Call Standard.

The <armlink> command links your relocatable objects with the ARM C library to 
create a runnable program (here called <gcdtest>).

The <armsd> command invokes the debugger, with <gcdtest> as the program to be 
run.  Again -li specifies that little-endian memory is required (as with <armasm> 
above).  For more details on running programs under <armsd> see "<The ARM 
Symbolic Debugger (armsd)>" of the User Manual and "<armsd 
Command Language>" of the User Manual.


Related Topics
..............

 *  There are many examples of code which makes good use of the ARM's condition 
    codes and 'S' bit in recipes in chapter "<Exploring ARM Assembly 
    Language>".


Using the Barrel Shifter
------------------------


About This Recipe
.................

In this recipe you learn:

 *  how to index into an array efficiently in ARM assembler;

 *  how to use the barrel shifter in the main ARM instruction classes.


Addressing an Entry in a Table of Words
.......................................

The following piece of code inefficiently calculates the address of an entry in 
a table of words and then loads the desired word:

        ; R0 holds the entry number [0,1,2,...]

        LDR  R1, =StartOfTable
        MOV  R3, #4
        MLA  R1, R0, R3, R1
        LDR  R2, [R1]
        ...
    StartOfTable
        DCD <table data>

Loading the desired table entry is performed by first loading the start address 
of the table, then moving the immediate constant "4" into a register, using the 
multiply and add instruction to calculate the address, and finally loading the 
entry.  However, this operation can be performed by the barrel shifter more 
efficiently as follows:

        ; R0 holds the entry number [0,1,2,...]
        LDR  R1, =StartOfTable
        LDR  R2, [R1, R0, LSL #2]
        ...
    StartOfTable
        DCD <table data>

In this code the barrel shifter shifts R0 left 2 bits (ie. multiplies it by 
4), and this intermediate value is then used as the index for the LDR 
instruction.  Thus a single instruction performs the whole operation.  Such 
significant savings can frequently be made by making good use of the barrel 
shifter.
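
The arithmetic behind both versions can be checked with the C sketch below.  
The function names are hypothetical; the first models the MOV / MLA sequence 
and the second the shifted index, and both assume a table of 32 bit words.

```c
#include <assert.h>
#include <stdint.h>

/* Address of a word entry computed as in the MOV / MLA version:
   index * 4 + table_base. */
uint32_t entry_address_mla(uint32_t table_base, uint32_t index)
{
    assert(index < (1u << 30));          /* keeps index * 4 within 32 bits */
    return index * 4u + table_base;
}

/* Address of a word entry computed as in [R1, R0, LSL #2]:
   table_base + (index shifted left by 2). */
uint32_t entry_address_shift(uint32_t table_base, uint32_t index)
{
    assert(index < (1u << 30));          /* keeps index << 2 within 32 bits */
    return table_base + (index << 2);
}
```

The two functions return the same address for every valid index; the second 
simply folds the multiplication into the address calculation.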


The ARM's Barrel Shifter
........................

The ARM core contains a barrel shifter which takes a value to be shifted or 
rotated, an amount to shift or rotate by, and the type of shift or rotate. This 
can be used by various classes of ARM instructions to perform comparatively 
complex operations in a single instruction.  On ARMs up to and including the 
ARM6 family, instructions take no longer to execute by making use of the barrel 
shifter, unless the amount to be shifted is specified by a register, in which 
case the instruction will take an extra cycle to complete.

The barrel shifter can perform the following types of operation:

    LSL           shift left by n bits;

    LSR           logical shift right by n bits;

    ASR           arithmetic shift right by n bits (the bits fed into the top 
                  end of the operand are copies of the original top (or sign) 
                  bit);

    ROR           rotate right by n bits;

    RRX           rotate right extended by 1 bit.  This is a 33 bit rotate, 
                  where the 33rd bit is the PSR C flag.

The barrel shifter can be used in several of the ARM's instruction classes.  
The options available in each case are described below.
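
The five operations can be modelled in C as sketched below.  The function 
names are invented for illustration, and the model deliberately restricts shift 
amounts to 0-31; it does not attempt to describe how the hardware treats larger 
amounts or how the carry flag is affected by anything other than RRX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

uint32_t lsl(uint32_t x, unsigned n) { assert(n < 32); return x << n; }

uint32_t lsr(uint32_t x, unsigned n) { assert(n < 32); return x >> n; }

/* Arithmetic shift right: vacated top bits are copies of the sign bit. */
uint32_t asr(uint32_t x, unsigned n)
{
    uint32_t fill;

    assert(n < 32);
    fill = (x & 0x80000000u) ? (uint32_t)~(0xFFFFFFFFu >> n) : 0u;
    return (x >> n) | fill;
}

/* Rotate right: bits leaving the bottom re-enter at the top. */
uint32_t ror(uint32_t x, unsigned n)
{
    assert(n < 32);
    return n == 0 ? x : (x >> n) | (x << (32 - n));
}

/* Rotate right extended: a 33 bit rotate by one place through the carry.
   carry_in enters at bit 31 and the old bit 0 becomes *carry_out. */
uint32_t rrx(uint32_t x, bool carry_in, bool *carry_out)
{
    assert(carry_out != 0);
    *carry_out = (x & 1u) != 0;
    return (x >> 1) | ((uint32_t)carry_in << 31);
}
```

For example, ror(0xFF, 6) gives 0xFC000003, which is how that constant can be 
built from an 8 bit value.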


LDR/STR
.......

The index can be a register shifted by any 5 bit constant.  It may also be an 
unshifted 12 bit constant. eg.

    STR  R7, [R0], #24         ; Post-indexed
    LDR  R2, [R0], R4, ASR #4  ; Post-indexed
    STR  R3, [R0, R5, LSL #3]  ; Pre-indexed
    LDR  R6, [R0, R1, ROR #6]! ; Pre-indexed + Writeback


Explanation

In all of the above instructions R0 is the base register.

In the pre-indexed instructions the offset is calculated and added to the base.  
This address is used for the transfer.  If writeback is selected, then the 
transfer address is written back into the base register.

In the post-indexed instructions the offset is calculated and added to the base 
after the transfer.  The base register is always updated by post-indexed 
instructions.
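
The difference between the two forms can be sketched in C as follows.  The 
function names are hypothetical; base points at a variable standing in for the 
base register, offset is the already shifted index, and each function returns 
the address that would be used for the transfer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pre-indexed: the offset is added before the transfer.  The base
   register is updated only if writeback is requested. */
uint32_t pre_indexed(uint32_t *base, uint32_t offset, bool writeback)
{
    uint32_t address;

    assert(base != 0);
    address = *base + offset;
    if (writeback)
        *base = address;
    return address;
}

/* Post-indexed: the transfer uses the unmodified base, and the base
   register is then always updated. */
uint32_t post_indexed(uint32_t *base, uint32_t offset)
{
    uint32_t address;

    assert(base != 0);
    address = *base;
    *base = address + offset;
    return address;
}
```

Starting with a base of &1000, a pre-indexed access with offset 8 and no 
writeback uses address &1008 and leaves the base at &1000, whereas a 
post-indexed access with the same offset uses &1000 and leaves the base at 
&1008.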


Data Processing Operations
..........................

The last operand (the second for binary operations, and the first for unary 
operations) may be:

 *  an 8 bit constant rotated right through an even number of positions. eg.

          ADD R0, R1, #&C5, 10
          MOV R5, #&FC000003

    Note that in the second example the assembler is left to work out how to 
    split the constant &FC000003 into an 8 bit constant and an even rotate (in 
    this case "#&FC000003" could be replaced by "#&FF, 6").  See "<Loading 
    Constants into Registers>" for more information.

 *  a register (optionally) shifted or rotated either by a 5-bit constant or by 
    another register. eg.

          ADD R0, R1, R2
          SUB R0, R1, R2, LSR #10
          CMP R1, R2, ROR R5
          MVN R3, R2, RRX


Program Status Register Transfer Instructions
.............................................

For the precise format of these instructions see the appropriate datasheet. 


Related Topics
..............

For more examples which make good use of the barrel shifter see many of the 
recipes in chapter "<Exploring ARM Assembly Language>".

The following cover loading constants into registers, and explain how <armasm> 
can help out the assembly language programmer:

 *  "<MOV / MVN>";

 *  "<LDR Rd, =numeric constant>".


Flexibility of Load and Store Multiple
--------------------------------------


About this Recipe
.................

In this recipe you learn about:

 *  the benefits and capabilities of the load and store multiple instructions;

 *  types of stacks supported directly by load and store multiple.


Multiple vs Single Transfers
............................

The Load and Store Multiple instructions provide a way to efficiently move the 
contents of several registers to and from memory.  The advantages of using a 
single load or store multiple instruction over a series of load or store single 
instructions are:

 *  Smaller code size;

 *  On Von Neumann architectures such as all ARMs up to the ARM6 family, there 
    is only a single instruction fetch overhead, rather than many instruction 
    fetches;

 *  On Von Neumann architectures, only one register write back cycle is 
    required for a load multiple, as opposed to one for every load single;

 *  On uncached ARM processors, the first word of data transferred by a load or 
    store multiple will always be a non-sequential memory cycle, but all 
    subsequent words transferred can be sequential (faster) memory cycles.


The Register List
.................

The registers the load and store multiple instructions transfer are encoded 
into the instruction by one bit for each of the registers R0 to R15.  A set bit 
indicates the register will be transferred, and a clear bit indicates that it 
will not be transferred.  Thus it is possible to transfer any subset of the 
registers in a single instruction.

The way the subset of registers to be transferred is specified is simply by 
listing those registers which are to be transferred in curly brackets eg.

    {R1, R4-R6, R8, R10}


Increment / Decrement, Before / After
.....................................

The base address for the transfer can either be incremented or decremented 
between register transfers, and this can happen either before or after each 
register transfer.  eg.

    STMIA R10, {R1, R3-R5, R8}

The suffix IA could also have been IB, DA or DB, where I indicates increment, D 
decrement, A after and B before.


Base Register Writeback
.......................

In the last instruction, although the address of the transfer was changed after 
each transfer, the base register was not updated at any point. Register 
writeback can be specified so that the base register is updated. Clearly the 
base register will change by the same amount whether "before" or "after" is 
selected.  An example of a load multiple using base writeback is:

    LDMDB R11!, {R9, R4-R7}


Note

In all cases the lowest numbered register is transferred to or from the lowest 
memory address, and the highest numbered register to or from the highest 
address.  [The order in which the registers are listed in the register list 
makes no difference.  Also, the ARM always performs sequential memory accesses 
in increasing memory address order.  Therefore 'decrementing' transfers 
actually perform a subtraction first and then increment the transfer address 
register by register].
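
The address arithmetic implied by the four suffixes and by the note above can 
be sketched in C as below.  All names are invented for illustration; reg_list 
is the 16 bit register mask (bit n set means Rn is transferred) and addresses 
are byte addresses.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { IA, IB, DA, DB } Mode;

/* Number of registers named in a 16 bit register list. */
static unsigned count_regs(uint16_t list)
{
    unsigned n = 0;
    while (list) { n += list & 1u; list >>= 1; }
    return n;
}

/* Lowest address touched by the transfer. */
uint32_t lowest_address(Mode mode, uint32_t base, uint16_t reg_list)
{
    uint32_t bytes = 4u * count_regs(reg_list);

    assert(reg_list != 0);
    switch (mode) {
    case IA: return base;
    case IB: return base + 4u;
    case DA: return base - bytes + 4u;
    case DB: return base - bytes;
    }
    return base;
}

/* Address used for register 'reg': lower numbered registers always
   occupy lower addresses, whatever order the list was written in. */
uint32_t register_address(Mode mode, uint32_t base, uint16_t reg_list,
                          unsigned reg)
{
    uint16_t lower;

    assert(reg < 16 && (reg_list & (1u << reg)) != 0);
    lower = (uint16_t)(reg_list & ((1u << reg) - 1u));
    return lowest_address(mode, base, reg_list) + 4u * count_regs(lower);
}

/* Value written back to the base register when writeback is requested. */
uint32_t base_after_writeback(Mode mode, uint32_t base, uint16_t reg_list)
{
    uint32_t bytes = 4u * count_regs(reg_list);
    return (mode == IA || mode == IB) ? base + bytes : base - bytes;
}
```

Taking LDMDB R11!, {R9, R4-R7} with R11 initially &1000 as an example, five 
registers are transferred, R4 is loaded from &FEC, R9 from &FFC, and R11 is 
left holding &FEC.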


Stack Notation
..............

Since the load and store multiple instructions have the facility to update the 
base register (which for stack operations can be the stack pointer), these 
instructions provide single instruction push and pop operations for any number 
of registers: load multiple is pop, and store multiple is push.

There are several types of stack which the Load and Store Multiple Instructions 
can be used with:

 *  Ascending or descending stacks.  ie. the stack grows up memory or down 
    memory.  [Sometimes a pair of stacks, one of which grows up memory and one 
    of which grows downwards are used - thus choosing the direction is not 
    always just a matter of taste].

 *  Empty or Full stacks.  The stack pointer can either point to the top item 
    in the stack (a full stack), or the next free space on the stack (an empty 
    stack).

As stated above, pop and push operations for these stacks can be implemented 
directly by load and store multiple instructions.  To make it easier for the 
programmer, special stack suffixes can be added to the LDM and STM instructions 
(as an alternative to Increment / Decrement and Before / After suffixes) as 
follows:

    STMFA R10!, {R0-R5}   ; Push R0-R5 onto a Full Ascending Stack
    LDMFA R10!, {R0-R5}   ; Pop  R0-R5 from a Full Ascending Stack
    
    STMFD R10!, {R0-R5}   ; Push R0-R5 onto a Full Descending Stack
    LDMFD R10!, {R0-R5}   ; Pop  R0-R5 from a Full Descending Stack
    
    STMEA R10!, {R0-R5}   ; Push R0-R5 onto an Empty Ascending Stack
    LDMEA R10!, {R0-R5}   ; Pop  R0-R5 from an Empty Ascending Stack
    
    STMED R10!, {R0-R5}   ; Push R0-R5 onto an Empty Descending Stack
    LDMED R10!, {R0-R5}   ; Pop  R0-R5 from an Empty Descending Stack
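
Each stack suffix is simply another name for one of the Increment / Decrement, 
Before / After forms; for a Full Descending stack, STMFD is the same 
instruction as STMDB and LDMFD the same as LDMIA.  The C sketch below models 
such a stack to show why: a push decrements before storing and a pop loads 
before incrementing.  The type and function names, and the fixed size of 16 
words, are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* sp is the word index of the top item (a full stack); the value
   STACK_WORDS means that the stack holds nothing. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    unsigned sp;
} FullDescendingStack;

void fd_init(FullDescendingStack *s)
{
    assert(s != 0);
    s->sp = STACK_WORDS;
}

/* Like STMFD sp!, {...}: the stack pointer moves down first, and
   values[0] (the lowest numbered register) ends at the lowest address. */
void fd_push(FullDescendingStack *s, const uint32_t *values, unsigned count)
{
    unsigned i;

    assert(count <= s->sp);
    s->sp -= count;
    for (i = 0; i < count; i++)
        s->mem[s->sp + i] = values[i];
}

/* Like LDMFD sp!, {...}: values are read from the top item upwards,
   then the stack pointer moves up past them. */
void fd_pop(FullDescendingStack *s, uint32_t *values, unsigned count)
{
    unsigned i;

    assert(s->sp + count <= STACK_WORDS);
    for (i = 0; i < count; i++)
        values[i] = s->mem[s->sp + i];
    s->sp += count;
}
```

Pushing a set of values and then popping the same number of values returns 
them in their original order and restores the stack pointer.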


Related Topics
..............

For more information on using stacks in assembly language see "<Stacks in 
Assembly Language>".

For further discussion of some of the benefits which can be gained by using LDM 
and STM see "<Loop Unrolling>".


Loading Constants into Registers
--------------------------------


About this Recipe
.................

This recipe explains and demonstrates:

 *  Why loading constants / addresses is an issue on the ARM;

 *  How to solve it using MOV / MVN;

 *  How to solve it using LDR Rd, =<expression>;

 *  How to solve it using ADR and ADRL.


Why is Loading Constants an issue ?
...................................

Since all ARM instructions are precisely 32 bits long, and ARM instructions do 
not use the instruction stream as data, there is no single instruction which 
will load any 32 bit immediate constant into a register without performing a 
data load from memory.

However, there are ways to load many commonly used constants into a register 
without resorting to a data load from memory.  Of course, a data load from 
memory allows any 32-bit value to be loaded into a register, but the added 
expense of a data load can often be avoided.

The assembler provides several 'instruction extensions' and two pseudo 
instructions to make the efficient loading of constants and addresses 
painless.


MOV / MVN
.........

As described in the recipe "<Using the Barrel Shifter>", the 
MOV and MVN instructions allow many constants to be constructed.  The constants 
which these instructions can construct must be eight bit constants rotated 
right through an even number of positions.  By using MVN the bitwise complement 
of such values can also be constructed.

Having to convert a constant into this form is an onerous task no-one wants to 
do.  Therefore <armasm> will do this automatically.  Either MOV or MVN may be 
used with a constant which can be constructed using either of these 
instructions.  <armasm> will choose the correct instruction and construct the 
constant.  If it is impossible to construct the desired constant <armasm> will 
report this as an error.

To illustrate this, look at the following MOV and MVN instructions.  The 
instruction listed in the comment is the ARM instruction which is produced by 
<armasm>.

    MOV R0, #0            ; => MOV R0, #0
    MOV R1, #&FF000000    ; => MOV R1, #&FF, 8 
    MOV R2, #&FFFFFFFF    ; => MVN R2, #0
    MVN R0, #1            ; => MVN R0, #1
    MOV R1, #&FC000003    ; => MOV R1, #&FF, 6
    MOV R2, #&03FFFFFC    ; => MVN R2, #&FF, 6
    MOV R3, #&55555555    ; Reports an error (it cannot be constructed)


Assembling the Example
......................

The above code is available in <loadcon1.s> in the <examples> directory.  To 
assemble it first set the current directory to <examples> and then issue the 
command:

    armasm loadcon1.s -o loadcon1.o -li

To confirm that <armasm> produced the correct code, the code area can be 
disassembled by looking at the output from:

    decaof -c loadcon1.o


Explanation

The -li argument can be omitted if the tools have been configured 
appropriately.  See "<The ARM Tool Reconfiguration Utility (reconfig)>" 
of the User Manual for details.

<decaof> is the ARM Object Format decoder.  The -c option requests that decaof 
disassemble the code area.


LDR Rd, =numeric constant
.........................

<armasm> provides a mechanism which, unlike MOV and MVN, can construct any 
32-bit numeric constant, but which may have to use a data load from memory 
rather than a data processing operation to do it.  This is the "LDR Rd, =" 
mechanism.

If the numeric constant can be constructed by using either MOV or MVN, then 
this will be the instruction used to load the constant.  If this cannot be 
done, however, <armasm> will produce an LDR instruction to read the constant 
from a literal pool.


Literal Pools
.............

A literal pool is a portion of memory set aside for constants.  By default a 
literal pool is placed right at the end of the program.  However, for large 
programs, this literal pool may not be accessible throughout the program (due 
to the LDR offset being a 12 bit value), so further literal pools can be placed 
using the LTORG directive.

When the "LDR Rd, =" mechanism needs to access a literal in a literal pool, 
<armasm> first checks previously encountered literal pools to see if the 
desired constant is already available and addressable.  If it is, then this 
literal is addressed; otherwise <armasm> will attempt to place the literal in 
the next available literal pool.  If this literal pool is not addressable then 
an error will result, and an additional LTORG should be placed close to (but 
after) the failed "LDR Rd, =" instruction.

Although this may sound somewhat complicated, in practice, it is simple to use.  
Consider the following example, which demonstrates how literal pools and "LDR 
Rd,=" work.  The instruction listed in the comment is the ARM instruction which 
gets produced by <armasm>.

      AREA Example, CODE, REL
    
      LDR R0, =42         ; => MOV R0, #42
      LDR R1, =&55555555  ; => LDR R1, [PC, #offset to Literal Pool 1]
      LDR R2, =&FFFFFFFF  ; => MVN R2, #0
    
      LTORG               ; Literal Pool 1 contains literal &55555555
    
      LDR R3, =&55555555  ; => LDR R3, [PC, #offset to Literal Pool 1]
    ; LDR R4, =&66666666  ; If this is uncommented it will fail, as 
                          ; Literal Pool 2 is not accessible (out of reach)
    
    LargeTable2 % 4200
    
      END                 ; Literal Pool 2 is empty
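
The reach of a literal pool can be estimated with the C sketch below.  It 
relies on two facts: the LDR offset is a 12 bit magnitude (0 to 4095 bytes, 
added or subtracted), and the PC value used by an instruction is the address of 
that instruction plus 8.  The function name is hypothetical, and the addresses 
quoted afterwards assume that the AREA is placed at address 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a PC relative LDR located at ldr_address can address a
   literal stored at literal_address. */
bool literal_reachable(uint32_t ldr_address, uint32_t literal_address)
{
    int64_t pc, offset;

    assert((ldr_address & 3u) == 0);      /* instructions are word aligned */
    pc = (int64_t)ldr_address + 8;        /* PC reads 8 bytes ahead        */
    offset = (int64_t)literal_address - pc;
    return offset >= -4095 && offset <= 4095;
}
```

With the AREA at address 0, Literal Pool 1 lies at &0C and is reachable from 
the LDR at &10.  If the commented-out LDR R4 were present it would lie at &14, 
and Literal Pool 2 would follow the 4200 byte table at &1080, an offset of 4196 
bytes from the PC and therefore out of reach.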


Assembling the Example
......................

The above code is available in <loadcon2.s> in the <examples> directory.  To 
assemble it first set the current directory to <examples> and then issue the 
command:

    armasm loadcon2.s -o loadcon2.o -li

To confirm that <armasm> produced the correct code, the code area can be 
disassembled by looking at the output from:

    decaof -c loadcon2.o


Explanation

The -li argument can be omitted if the tools have been configured 
appropriately.  See "<The ARM Tool Reconfiguration Utility (reconfig)>" 
of the User Manual for details.

<decaof> is the ARM Object Format decoder.  The -c option requests that decaof 
disassemble the code area.


LDR Rd, =PC relative expression
...............................

As well as numeric constants, the "LDR Rd, =" mechanism can cope with PC 
relative expressions, such as labels.

Even if a PC relative ADD or SUB could be constructed, an LDR will be generated 
to load the PC relative expression.  Thus if a PC relative ADD or SUB is 
desired then ADR should be used instead (see "<ADR and ADRL>").  If no suitable 
literal is already available, then the literal placed into the next literal 
pool will be the offset into the AREA, and an AREA relative relocation 
directive will be added to ensure that the constant is appropriate wherever the 
containing AREA gets located by the linker.  See "<The Handling of Relocation 
Directives>" of the Reference Manual for more information about relocation 
directives.

As an example consider the code below.  The instruction listed in the comment 
is the ARM instruction which gets produced by <armasm>.

      AREA Example, CODE, REL
    
    Start
      LDR R0, =Start                ; => LDR R0, [PC, #offset to Litpool 1]
      LDR R1, =DataArea + 12        ; => LDR R1, [PC, #offset to Litpool 1]
      LDR R2, =DataArea + 6000      ; => LDR R2, [PC, #offset to Litpool 1]
    
      LTORG                         ; Literal Pool 1 holds three literals
    
      LDR R3, =DataArea + 6000      ; => LDR R3, [PC, #offset to Litpool 1]
                                    ; (sharing with previous literal)
    ; LDR R4, =DataArea + 6004      ; If uncommented will produce an error
                                    ; as Litpool 2 is out of range
    
    DataArea % 8000
    
      END                           ; Literal Pool 2 is out of range of
                                    ; the LDR instructions above


Assembling the Example
......................

The above code is available in <loadcon3.s> in the <examples> directory.  To 
assemble it first set the current directory to <examples> and then issue the 
command:

    armasm loadcon3.s -o loadcon3.o -li

To confirm that <armasm> produced the correct code, the code area can be 
disassembled by looking at the output from:

    decaof -c loadcon3.o


Explanation

The -li argument can be omitted if the tools have been configured 
appropriately.  See "<The ARM Tool Reconfiguration Utility (reconfig)>" 
of the User Manual for details.

<decaof> is the ARM Object Format decoder.  The -c option requests that decaof 
disassemble the code area.


ADR and ADRL
............

Sometimes it is important for efficiency purposes that loading an address does 
not perform a memory access.  The assembler provides two pseudo instructions 
which make it easier to do this.

Whereas MOV and MVN only accept numeric constants, ADR and ADRL accept numeric 
constants, PC relative expressions (labels within the same area) and register 
relative expressions.

ADR will attempt to produce a single data processing instruction to load an 
address into a register.  This instruction will be one of MOV, MVN, ADD or SUB, 
in the same way as the "LDR Rd, =" mechanism produces instructions. If the 
desired address cannot be constructed in a single instruction an error will be 
produced.

ADRL will attempt to produce two data processing instructions to load an 
address into a register.  Even if it is possible to produce a single data 
processing instruction to load the address into the register, a second, 
redundant instruction will be produced (this is a consequence of the strict 
two-pass nature of <armasm>).  In cases where it is not possible to construct 
the address using two data processing instructions ADRL will produce an error; 
the "LDR Rd, =" mechanism is probably the best option in this case.

As an example consider the code below.  The instructions listed in the comments 
are the ARM instructions which are produced by <armasm>.

      AREA Example, CODE, REL
    
    Start
      ADR  R0, &8000              ; => MOV R0, #&8000
    ; ADR  R1, &8001              ; This would fail as it cannot be
                                  ; constructed by a MOV or MVN
      ADR  R2, Start              ; => SUB R2, PC, #offset to Start
      ADR  R3, DataArea           ; => ADD R3, PC, #offset to DataArea
    ; ADR  R4, DataArea+4300      ; This would fail as the offset cannot
                                  ; be expressed by operand2 of an ADD
      ADRL R5, DataArea+4300      ; => ADD R5, PC, #offset1
                                  ;    ADD R5, R5, #offset2
      ADRL R6, &8001              ; => MOV R6, #1
                                  ;    ADD R6, R6, #&8000
    ; ADRL R7, &55555555          ; This would fail as the constant cannot
                                  ; be constructed by 2 data processing
                                  ; instructions
    DataArea % 8000
    
      END
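
Whether a value can be built by two data processing instructions can be 
explored with the C sketch below.  It is a simplified test, not the algorithm 
used by <armasm>: it only looks for a split of the value into two 
non-overlapping parts, each of which is an 8 bit constant rotated through an 
even number of positions.  All names are invented for the example, and for a PC 
relative ADRL the value in question is the offset from the PC rather than the 
number written in the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t rotate_left(uint32_t x, unsigned n)
{
    assert(n < 32);
    return n == 0 ? x : (x << n) | (x >> (32 - n));
}

/* True if value is an 8 bit constant rotated by an even amount. */
bool is_rotated_imm8(uint32_t value)
{
    unsigned rot;

    for (rot = 0; rot < 32; rot += 2) {
        if (rotate_left(value, rot) <= 0xFFu)
            return true;
    }
    return false;
}

/* Tries each even position of an 8 bit window.  The bits inside the
   window always form a valid constant, so the split succeeds when the
   bits left outside it form a valid constant as well. */
bool split_into_two_imm8(uint32_t value, uint32_t *first, uint32_t *second)
{
    unsigned rot;

    assert(first != 0 && second != 0);
    for (rot = 0; rot < 32; rot += 2) {
        uint32_t window  = rotate_left(0xFFu, rot);
        uint32_t inside  = value & window;
        uint32_t outside = value & (uint32_t)~window;

        if (is_rotated_imm8(outside)) {
            *first = inside;
            *second = outside;
            return true;
        }
    }
    return false;
}
```

Under this test &8001 splits into 1 and &8000, matching the MOV / ADD pair 
shown for ADRL R6 in the listing, while &55555555 cannot be split and so cannot 
be handled by ADRL.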


Assembling the Example
......................

The above code is available in <loadcon4.s> in the <examples> directory.  To 
assemble it first set the current directory to <examples> and then issue the 
command:

    armasm loadcon4.s -o loadcon4.o -li

To confirm that <armasm> produced the correct code, the code area can be 
disassembled by looking at the output from:

    decaof -c loadcon4.o


Explanation

The -li argument can be omitted if the tools have been configured 
appropriately.  See "<The ARM Tool Reconfiguration Utility (reconfig)>" 
of the User Manual for details.

<decaof> is the ARM Object Format decoder.  The -c option requests that decaof 
disassemble the code area.


Related Topics
..............

For more information on the capabilities of the barrel shifter see "<Using the 
Barrel Shifter>".

 

 

