                  INCREASING THE SPEED OF C PROGRAMS

                            Matthew Probert

                           Servile  Software


To reduce the time your program spends executing, it is essential to 
know your host computer. Most computers are very slow at displaying 
information on the screen, and the IBM PC is no exception. C offers 
various functions for displaying data; printf() is one of the most 
commonly used and also one of the slowest, because it has to 
interpret its format string on every call. Whenever possible, use 
puts(varname) in place of printf("%s\n",varname), remembering that 
puts() appends a newline to the string sent to the screen. 
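
As a minimal sketch of the substitution described above, the two 
helpers below send the same text to the screen. The names 
show_with_printf() and show_with_puts() are made up for this 
illustration; only printf() and puts() come from the standard library. 

```c
#include <stdio.h>

/* Hypothetical helper: prints a line using the general formatter.
   printf() returns the number of characters written. */
int show_with_printf(const char *text)
{
    return printf("%s\n", text);
}

/* Hypothetical helper: same visible output, but there is no format
   string to interpret. puts() adds the newline itself and returns a
   non-negative value on success. */
int show_with_puts(const char *text)
{
    return puts(text);
}
```

Some compilers make this substitution themselves, so the gain is 
worth measuring rather than assuming. 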

When multiplying a variable by a constant which is a power of 2, many 
C compilers will recognise that a left shift is all that is required 
in the assembler code to carry out the multiplication rapidly. When 
multiplying by other small constants it is sometimes faster to use 
repeated addition instead, so; 

    'x * 3' becomes 'x + x + x' 

Don't try this with variable multipliers in a loop, because it becomes 
very slow! Where the multiplier is a small constant it can be faster, 
but only sometimes, so time both forms before committing to one. 
Another way to speed up multiplication and division is with the shift 
operators, << and >>. 

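For an unsigned or non-negative integer, the instruction x /= 2 can 
equally well be written x >>= 1, which shifts the bits of x right one 
place. (The result of right-shifting a negative value is 
implementation-defined, so the two need not agree for negative x.) 
Many compilers actually convert integer divisions by 2 into a shift 
right instruction. You can use the shifts for multiplying and dividing 
by any power of 2: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024 and so on. 
If you have difficulty understanding the shift operators, consider the 
binary form of a number; 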

    01001101    equal to   77 

shifted right one place it becomes; 

    00100110    equal to   38 
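
The shift operations can be tried out with a few small functions. This 
is only a sketch: the function names are invented for the example, and 
unsigned values are used so that the shifts are well defined. The last 
function is a variation on 'x + x + x' that uses one shift and one 
addition instead. 

```c
/* Multiply x by 2 to the power 'places' with a left shift. */
unsigned int shift_multiply(unsigned int x, unsigned int places)
{
    return x << places;
}

/* Divide x by 2 to the power 'places' with a right shift.
   Any remainder is discarded, as with integer division. */
unsigned int shift_divide(unsigned int x, unsigned int places)
{
    return x >> places;
}

/* x * 3 written as (x * 2) + x: one shift and one addition. */
unsigned int times_three(unsigned int x)
{
    return (x << 1) + x;
}
```

For instance, shift_divide(77, 1) gives 38, matching the binary 
example above. 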


Try to use integers rather than floating point numbers wherever 
possible. Sometimes you can use integers where you didn't think you 
could! For example, to express a fraction as a percentage one would 
normally write; 

    percentage = x / y * 100 

This requires floating point variables. However, it can also be 
written as; 

    z = x * 100;
    percentage = z / y;

This works fine with integers, so long as you don't mind the 
percentage being truncated. For example; 

    5 / 7 * 100 is equal to 71.43 with floating point 

but with integers; 

    5 * 100 / 7 is equal to 71 

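(The * and / operators have equal precedence and group left to right, 
so x * 100 / y is always evaluated as (x * 100) / y. The separate 
'z = x * 100' step is not required, but it makes the intent obvious. 
Do check that x * 100 cannot overflow the integer type in use.) 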

Here is a test program using this idea; 

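/* Floating point version */
float funca(double x, double y) 
{ 
    return(x / y * 100); 
} 

/* Integer version: multiply first, then divide (result truncated) */
int funcb(int x,int y) 
{ 
    return(x * 100 / y); 
} 

int main(void)
{ 
    int n; 
    double x; 
    int y; 

    /* Call each function 5000 times so the profiler can compare them */
    for(n = 0; n < 5000; n++)
    { 
        x = funca(5,7); 
        y = funcb(5,7); 
    } 
    return 0;
} 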

And here are the results of the test program fed through a profiler 
using floating point emulation (no co-processor available); 

funca          1.9169 sec  96% |********************************************** 
funcb          0.0753 sec   3% |* 

You can clearly see that the floating point function is about 25 
times slower than the integer equivalent! 
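
If no profiler is to hand, a rough comparison can be made with the 
standard clock() function instead. The sketch below is a substitute 
technique, not the profiler used for the figures above, and the names 
time_calls() and sample_work() are invented for the example. 

```c
#include <time.h>

/* volatile so that the compiler does not discard the work */
static volatile int sink;

/* Hypothetical workload: the integer percentage calculation. */
void sample_work(void)
{
    sink = 5 * 100 / 7;
}

/* Hypothetical helper: calls fn() 'repeats' times and returns the
   processor time used, in seconds, as reported by clock(). */
double time_calls(void (*fn)(void), long repeats)
{
    clock_t start = clock();
    long i;

    for (i = 0; i < repeats; i++)
        fn();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

The resolution of clock() is coarse on many systems, so use enough 
repeats for the total to be comfortably measurable. 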

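NB: The ANSI standard on C does guarantee that operators of equal 
precedence, such as * and /, are grouped left to right. What it leaves 
unspecified is the order in which the operands themselves are 
evaluated, which only matters when they have side effects (function 
calls, ++ and the like). If in doubt, use parentheses or a temporary 
variable. 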
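
The integer percentage calculation can be wrapped in a small function 
that also guards against the two obvious hazards, overflow and 
division by zero. This is a sketch only: percent_of() is a made-up 
name, the -1 error value is an arbitrary choice, and the overflow 
guard assumes that long is wider than int, as it is on 16-bit PC 
compilers. 

```c
/* Hypothetical helper: x as a whole-number percentage of y,
   truncated. Returns -1 if y is zero. */
int percent_of(int x, int y)
{
    long scaled;

    if (y == 0)
        return -1;

    /* Widen before multiplying so that the multiplication is done
       in long arithmetic rather than int arithmetic. */
    scaled = (long)x * 100L;
    return (int)(scaled / y);
}
```

With the values used in the text, percent_of(5, 7) gives 71. 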

Another way of increasing speed is to use pointers rather than array 
indexing. When you access an array through an index, for example with; 

    x = data[i];

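the compiler has to calculate the offset of data[i] from the beginning 
of the array, typically by multiplying i by the element size and 
adding the result to the base address. With a simple compiler this is 
repeated on every access. (An optimising compiler may remove the 
overhead for you, so profile both forms.) Using pointers can often 
improve things, as the following two bubble sorts, one with array 
indexing and one with pointers, illustrate; 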

void BUBBLE()
{
    /* Bubble sort using array indexing */

    int a;
    int b;
    int temp;

    for(a = lastone; a >= 0; a--)
    {
        for(b = 0; b < a; b++)
        {
            if(data[b] > data[b + 1])
            {
                temp = data[b];
                data[b] = data[b + 1];
                data[b + 1] = temp;
            }
        }
    }
}

void PTRBUBBLE()
{
    /* Bubble sort using pointers */

    int temp;
    int *ptr;
    int *ptr2;

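    /* Stop while ptr > data: stepping a pointer below the start of
       the array is undefined, and that last pass would do nothing */
    for(ptr = &data[lastone]; ptr > data; ptr--)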
    {
        for(ptr2 = data; ptr2 < ptr; ptr2++)
        {
            if(*ptr2 > *(ptr2 + 1))
            {
                temp = *ptr2;
                *ptr2 = *(ptr2 + 1);
                *(ptr2 + 1) = temp;
            }
        }
    }
}

Here are the profiler results for the two versions of the same bubble 
sort operating on the same 1000-item, randomly ordered list; 

BUBBLE          3.1307 sec  59% |******************************************
PTRBUBBLE       2.1686 sec  40% |***************************


Here is another example, initialising an array, first with the common 
indexing approach and then with the pointer approach; 

/* Index array initialisation (random() is the Borland/Turbo C
   library routine; elsewhere substitute rand() % 1000) */
int n;

for(n = 0; n < 1000; n++)
    data[n] = random(1000);


/* Pointer array initialisation */
int *n;

for(n = data; n < &data[1000]; n++)
    *n = random(1000);

Again, the pointer approach is generally the faster of the two. It is 
only really of benefit when an array is going to be traversed in 
sequence, as in the above examples. In the case of, say, a binary 
search, where a different and non-adjacent element is tested on each 
pass, the pointer approach is no better than array indexing. 
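
The same traversal pattern applies to any pass over an array. As a 
further sketch, here are two ways of adding up the elements; the 
function names are invented for the example, and both return the same 
total. 

```c
#include <stddef.h>

/* Indexed traversal: the address of values[i] is worked out from
   i on each pass. */
long sum_indexed(const int *values, size_t count)
{
    long total = 0;
    size_t i;

    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Pointer traversal: the pointer is stepped to the next element. */
long sum_pointer(const int *values, size_t count)
{
    long total = 0;
    const int *p;
    /* One past the last element: legal to form and compare
       against, but never dereferenced. */
    const int *end = values + count;

    for (p = values; p < end; p++)
        total += *p;
    return total;
}
```

Whether the second form is actually faster depends on the compiler, 
so it is worth profiling both. 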


The exception to this rule of using pointers rather than indexed 
access comes with pointers to pointers. Say your program has declared 
a table of static data, such as: 

static char *colours[] = { "Black", "Blue", "Green", "Yellow", "Red", 
                           "White" };

It is faster to access the table with colours[n] than it is with a 
pointer, since each element in the table colours[] is itself a 
pointer. If you need to scan a string table for a value, you can use 
this very fast approach instead; 

First the table is changed into a single string, with some delimiter 
between the elements. 

static char *colours = "Black/Blue/Green/Yellow/Red/White";

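Then to confirm that a value is held in the table you can use 
strstr(), which returns a pointer to the match, or NULL if there is 
none. Bear in mind that strstr() matches substrings, so a partial name 
such as "Gre" is also found; check that the match is bounded by 
delimiters if that matters. 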

result = strstr(colours,"Cyan");
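
Because strstr() finds substrings, a lookup that must match whole 
entries only needs to check the characters either side of each hit. 
The following is a sketch of one way to do that; table_contains() is 
a made-up name, and the delimiter is passed in as a parameter. 

```c
#include <string.h>

/* Hypothetical helper: returns 1 if 'name' is a complete entry in
   the delimiter-separated string 'table', otherwise 0. */
int table_contains(const char *table, const char *name, char delim)
{
    size_t len = strlen(name);
    const char *hit = table;

    if (len == 0)
        return 0;

    while ((hit = strstr(hit, name)) != NULL)
    {
        /* A whole entry starts at the table start or just after a
           delimiter, and ends at a delimiter or the end of string. */
        int starts_entry = (hit == table) || (hit[-1] == delim);
        int ends_entry = (hit[len] == '\0') || (hit[len] == delim);

        if (starts_entry && ends_entry)
            return 1;
        hit++;   /* partial match only: search on from the next char */
    }
    return 0;
}
```

With the table above, table_contains(colours, "Red", '/') returns 1, 
while "Cyan" and the partial name "Gre" both return 0. 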


Using in-line assembler code can provide the greatest speed increase. 
Care must be taken, however, not to interfere with the compiled C 
code. It is usually safe to write a complete function with in-line 
assembler code, but mixing in-line assembler with C code can be 
hazardous. As a rule of thumb, get your program working without 
assembler code, and then, if you want to use in-line assembler, 
convert small portions of the code at a time, testing the program at 
each stage. Video I/O is a very slow process with C, and usually 
benefits from in-line assembler. Look especially for occasions where 
int86() is used and try to replace the code with in-line assembler.
