Floating Point Emulator Changes between Release 1.5 and Release 1.6
===================================================================

The version of the floating point emulator in releases up to 1.5 has been
superseded in release 1.6 by a new FPA10-compatible floating point emulator.

There are several major changes introduced in this version which users should
be aware of:

 *  Extended precision is now an IEEE-754 double-extended format.  This means
    that LDF/STF pairs may now cause exceptions if the value being stored is
    a NaN ("Not a Number", an IEEE-754 term), thus LDF and STF are no longer
    safe instructions to use for preserving floating point register contents.
    Instead the new LFM and SFM instructions should be used.

 *  New LFM and SFM instructions are available which are guaranteed never to
    cause an exception.  The compiler generates them in its default
    configuration as shipped, and the ARM C Library supplied with this
    release also uses them by default.

 *  There are further instruction set extensions, which are only likely to be
    of interest to users who write floating point code in assembly language
    or who want to make full use of the FPA10.  Full details are given in the
    FPA10 datasheet.


Consequences of the new Floating Point Emulator
-----------------------------------------------

The following tools are affected by the new FPE:

armcc   armcc configured as supplied produces LFM and SFM instructions to
        preserve floating point registers, and thus produces code which
        requires the new floating point emulator.  This can be reconfigured
        (or changed on the command line).  See the armcc section and the
        reconfig section in the User Manual for details.

C libraries     The version supplied pre-built contains LFM and SFM
        instructions to preserve floating point registers, and thus requires
        use of the new floating point emulator.  The library must be rebuilt
        (changing FPIS=3 to FPIS=2 in the makedefs file) if the old FPE is to
        be used.

armsd -armul, armwd -armul
       The ARMulator has the new floating point emulator built in.

armsd -serial, armwd -serial
       Debug Monitor ROMs built with previous releases of the ARM Software
       Toolkit have the old FPE built in, and thus do not support the LFM
       and SFM instructions used by code compiled with armcc as supplied in
       this release, or by the C library as supplied.  It is therefore
       advised that a new DeMon ROM be built containing the new FPE.  This
       can be done using this release of the ARM Software Toolkit.

Unless there is a very good reason to do otherwise, users are advised to use
the new floating point emulator rather than the old one supplied in previous
releases.


The New Load and Store Multiple Floating Instructions
-----------------------------------------------------

The following specification of the LFM and SFM instructions is taken from the
FPA10 Datasheet:

    31..28 27..24  23  22 21  20 19..16 15..12 11..8 7......0
    Cond    110P  U/D  Y  Wb  L/S  Rn   X Fd   0010   Offset 

Cond   	Condition field 
P      	Pre/post indexing bit (0=post; 1=pre)
U/D    	Up/down bit (0=down; 1=up) 
Y      	Register count (see below) 
Wb     	Write-back bit 
L/S    	Load/store bit (0=store to memory; 1=load from memory) 
Rn     	Base register 
X      	Register count (see below)
Fd     	Floating point register number
Offset 	Unsigned 8 bit immediate offset
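
As an illustration of the layout above, the following Python sketch packs
the fields into a 32-bit word. It is not taken from the datasheet: the
function name encode_lfm_sfm and its parameter names are hypothetical, and
the value used for the "always" condition comes from the standard ARM
condition codes rather than from this note.

```python
COPROCESSOR = 0b0010  # bits 11..8, as shown in the layout above


def encode_lfm_sfm(cond, p, u, y, wb, load, rn, x, fd, offset):
    """Pack the LFM/SFM fields into one 32-bit instruction word.

    offset is the unsigned 8 bit immediate held in bits 7..0.
    """
    for name, value, limit in (
        ("Cond", cond, 0xF), ("P", p, 1), ("U/D", u, 1), ("Y", y, 1),
        ("Wb", wb, 1), ("L/S", load, 1), ("Rn", rn, 0xF), ("X", x, 1),
        ("Fd", fd, 7), ("Offset", offset, 0xFF),
    ):
        if not 0 <= value <= limit:
            raise ValueError(f"{name} out of range: {value}")
    return (
        (cond << 28) | (0b110 << 25) | (p << 24) | (u << 23) | (y << 22)
        | (wb << 21) | (load << 20) | (rn << 16) | (x << 15) | (fd << 12)
        | (COPROCESSOR << 8) | offset
    )


# SFM F4,4,[R13,#-48]! : pre-indexed, down, write-back, store, four
# registers (Y=0, X=0), offset of 12 words.  Cond 0b1110 ("always") is an
# assumption taken from the standard ARM condition codes.
word = encode_lfm_sfm(cond=0xE, p=1, u=0, y=0, wb=1, load=0,
                      rn=13, x=0, fd=4, offset=12)
print(f"{word:08X}")  # prints ED2D420C
```

The range checks simply mirror the field widths in the layout; a real
assembler would apply further restrictions, such as those on R15.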

Description

The Load/Store Multiple Floating instructions allow between 1 and 4 floating
point registers to be transferred from/to memory in a single operation. These
operations allow groups of registers to be saved and restored efficiently
(e.g. across context switches).

The values are transferred as three words of data for each register; the data
format used is not defined (and may change in future implementations), and
the only legal operation that can be performed on this data is to load it
back into the FPA using the same implementation's LFM instruction. The data
stored in memory by an SFM instruction should not be used or modified by any
user process.

Note that coprocessor number 2 (bits 11-8 in the instruction field) rather
than the usual FPA coprocessor number of 1 must be used for these
instructions.

The offset in bits [7:0] is specified in words and is added to (U/D=1) or
subtracted from (U/D=0) a base register (Rn), either before (P=1) or after
(P=0) the base is used as the transfer address. The modified base value may
be written back into the base register (Wb=1) or the old value of the base
may be preserved (Wb=0). Note that post-indexed addressing modes require
explicit setting of the Wb bit, unlike LDR and STR which always write-back
when post-indexed. The value of the base register, modified by the offset in
a pre-indexed instruction, is used as the address for the transfer of the
first word. The second word will go to or come from an address one word (4
bytes) higher than the first transfer, and the address will be incremented by
one word for each subsequent transfer.
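
The addressing rules in the paragraph above can be sketched in Python as
follows. This is only a model of the description, not emulator code; the
function name transfer_addresses and its parameters are hypothetical.

```python
WORD_BYTES = 4      # one word is 4 bytes
WORDS_PER_REG = 3   # each register is transferred as three words


def transfer_addresses(base, offset_words, p, u, wb, n_regs):
    """Return (list of word addresses, final value of the base register)."""
    delta = offset_words * WORD_BYTES
    if not u:
        delta = -delta                     # U/D=0 subtracts the offset
    modified = base + delta
    start = modified if p else base        # P=1: offset applied first
    count = WORDS_PER_REG * n_regs
    addresses = [start + WORD_BYTES * i for i in range(count)]
    new_base = modified if wb else base    # Wb=0 preserves the old base
    return addresses, new_base


# Pre-indexed, down, with write-back: four registers below 0x8000.
addrs, rn = transfer_addresses(0x8000, 12, p=1, u=0, wb=1, n_regs=4)
print(hex(addrs[0]), hex(addrs[-1]), hex(rn))  # prints 0x7fd0 0x7ffc 0x7fd0
```

In the post-indexed case (p=0) the first word goes to the unmodified base
address, and the base register changes only if wb is set.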

Assembler Syntax

There are two alternative forms:

(1)    <LFM|SFM>{cond} Fd,<count>,[Rn]
                                  [Rn, #<expression>]{!}     
                                  [Rn],#<expression>

The first register to transfer is specified as Fd.

The number of registers to transfer is specified in the <count> field and is
encoded in Y (bit 22) and X (bit 15) as follows:

    Y  X  No. regs. to xfer
    0  1         1
    1  0         2
    1  1         3
    0  0         4

Registers are always transferred in ascending order and wrap around at
register F7. For example:

SFM F6,4,[R0]   will transfer F6,F7,F0,F1 to memory starting at the address
                contained in register R0.
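
The count encoding and the wrap-around rule can be checked with a short
Python sketch. The helper names count_to_yx and registers_transferred are
hypothetical and used only for illustration.

```python
def count_to_yx(count):
    """Return the (Y, X) bits for a register count of 1 to 4."""
    if count not in (1, 2, 3, 4):
        raise ValueError("count must be between 1 and 4")
    code = count & 3            # a count of 4 is encoded as 00
    return (code >> 1) & 1, code & 1


def registers_transferred(fd, count):
    """List the registers in transfer order, wrapping around at F7."""
    if not 0 <= fd <= 7:
        raise ValueError("Fd must be between 0 and 7")
    if count not in (1, 2, 3, 4):
        raise ValueError("count must be between 1 and 4")
    return [f"F{(fd + i) % 8}" for i in range(count)]


print(count_to_yx(4))               # prints (0, 0)
print(registers_transferred(6, 4))  # prints ['F6', 'F7', 'F0', 'F1']
```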

Pre-indexed addressing specification: 

[Rn] - offset of zero

[Rn, #<expression>]{!} - offset of <expression> bytes

{!} Write back the base register (set the Wb bit) if ! is present.

If Rn is R15, writeback should not be specified.   

Post-indexed addressing specification:

[Rn],#<expression> - offset of <expression> bytes

Note that the assembler automatically sets the Wb bit in this case.   R15
should not be used as the base register where post-indexed addressing is
used.                                               

Note that the <expression> must be divisible by 4 and be in the range -1020
to 1020.
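
As a sketch of how an assembler might apply this restriction, the Python
function below converts a byte <expression> into the U/D bit and the 8 bit
word offset. The name split_offset is hypothetical, and the choice of
U/D=1 for a zero offset is an assumption, not something this note states.

```python
def split_offset(expression):
    """Convert a byte offset into (U/D bit, offset in words)."""
    if expression % 4 != 0:
        raise ValueError("offset must be divisible by 4")
    if not -1020 <= expression <= 1020:
        raise ValueError("offset must be in the range -1020 to 1020")
    # Assumption: a zero offset is encoded with U/D=1 (up).
    up = 1 if expression >= 0 else 0
    return up, abs(expression) // 4


print(split_offset(-48))   # prints (0, 12)
print(split_offset(1020))  # prints (1, 255)
```

The limit of 1020 bytes follows from the 8 bit offset field: 255 words of
4 bytes each.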
                  

(2)      <LFM|SFM>{cond}<FD|EA> Fd,<count>,[Rn]{!}

This form of the instruction is intended for stacking type operations on the
floating point registers. The following table shows how the assembler
mnemonics translate into bits in the instruction.   

        Name            Stack   L bit   P bit   U bit
    post-inc load       LFMFD     1       0       1
    pre-dec  load       LFMEA     1       1       0
    post-inc store      SFMEA     0       0       1
    pre-dec  store      SFMFD     0       1       0
   
FD and EA define pre/post indexing and the up/down bit by reference to the
form of stack required. The F and E refer to a "full" or "empty" stack, i.e.
whether a pre-index has to be done (full) before storing to the stack. The A
and D refer to whether the stack is ascending or descending. If ascending, an
SFM will go up and an LFM down; if descending, vice versa. Note that only EA
and FD are permitted: the LFM/SFM instructions are not capable of supporting
empty descending or full ascending stacks.

  {!} Write back the base register (set the Wb bit) if ! is present.     

If Rn is R15, writeback should not be specified.  
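
The mnemonic table for the stacking form can be expressed as a small
lookup, sketched here in Python. The names STACK_FORMS and stack_bits are
hypothetical; the bit values are copied from the table above.

```python
# Mnemonic -> (L bit, P bit, U bit), copied from the table above.
STACK_FORMS = {
    "LFMFD": (1, 0, 1),  # post-inc load
    "LFMEA": (1, 1, 0),  # pre-dec  load
    "SFMEA": (0, 0, 1),  # post-inc store
    "SFMFD": (0, 1, 0),  # pre-dec  store
}


def stack_bits(mnemonic):
    """Return the L, P and U bits for a stacking-form mnemonic."""
    try:
        l_bit, p_bit, u_bit = STACK_FORMS[mnemonic.upper()]
    except KeyError:
        raise ValueError(f"unsupported stack form: {mnemonic}") from None
    return {"L": l_bit, "P": p_bit, "U": u_bit}


print(stack_bits("SFMFD"))  # prints {'L': 0, 'P': 1, 'U': 0}
```

A mnemonic such as LFMED is rejected, since empty descending and full
ascending stacks are not supported.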
