I'm pretty comfortable with C, but what I lack is an understanding of which options must be passed to the compiler to produce the tightest code possible. Here is an example of a formula in C:

int ADDR = VIDEO_BASE + (WND_X + POS_X) + ((WND_Y + POS_Y) * SCR_X) * 2;

and the resulting assembler code:


        push    rcx
        xor     eax, eax

    ; Determine vertical offset in characters.

        mov      al, [rbx + WND_Y]      ; Get window's vertical offset from top left
        add      al, [rbx + POS_Y]
        sub      al, 2                  ; Zero index co-ordinates
        mul     byte [rbx + SCR_X]      ; Number of columns / row

    ; Determine horizontal offset in characters.

        mov     rcx, rax
        xor     eax, eax
        mov      al, [rbx + WND_X]
        add      al, [rbx + POS_X]
        sub      al, 2                  ; Zero index co-ordinates

    ; Double the result, as each position has a character and an associated attribute.

        add     eax, ecx
        shl     rax, 1                  ; Offset *= 2
        add     rax, [rbx + VIDEO_BASE]

        pop     rcx

and the assembled code looks like this:

00  51                push rcx
01  31C0              xor eax,eax
03  8A430D            mov al,[rbx+0xd]
06  024303            add al,[rbx+0x3]
09  2C02              sub al,0x2
0B  F623              mul byte [rbx]
0D  4889C1            mov rcx,rax
10  31C0              xor eax,eax
12  8A430C            mov al,[rbx+0xc]
15  024302            add al,[rbx+0x2]
18  2C02              sub al,0x2
1A  01C8              add eax,ecx
1C  48D1E0            shl rax,1
1F  48034304          add rax,[rbx+0x4]
23  59                pop rcx
24  C3                ret

GCC is probably what I'm going to use, or at least its variant for Windows, unless there is another option for Windows other than Visual Studio. It would be nice to see the disassembled code that corresponds to your response.

Discussion span: 3 years. Last post by ShiftLeft.

You want the "tightest" code for an expression that adds a few variables and multiplies twice?

The assembly code the compiler generates will vary, but the execution time and overall size will be remarkably similar. There aren't many ways for a compiler to add or multiply.

If you are concerned about minimizing the run-time of your program, this is not the way to do it. You are adding and multiplying a few variables. What possible optimization do you expect to find or receive?

Your program's first priority is always accuracy. Why? Because if you can settle for a wrong answer, a simpleton program can deliver it instantly, every time.

First, then, get the program accurate in its results. THEN concentrate on run-time optimizing. There are excellent profiling tools that can show you just where your program is spending the majority of its time. Find those bottlenecks, and see what can be done to streamline them.

Edited by Adak


The gcc variants for Windows are MinGW and the native gcc compilers in Cygwin. In any case, RTFM with regard to optimization settings (-ON, where N is the optimization level). As for your specific code, the only thing I can see at the C level (sorry - bad pun) is the multiplication by 2. That can be reduced to a left shift by 1 bit, i.e.,

/* Instead of this */
int ADDR = VIDEO_BASE + (WND_X + POS_X) + ((WND_Y + POS_Y) * SCR_X) * 2;

/* Try this */
int ADDR = VIDEO_BASE + (WND_X + POS_X) + (((WND_Y + POS_Y) * SCR_X) << 1);

The only real caveat here is if the left shift causes a register overflow.
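For anyone who wants to check this on their own compiler, here is a minimal sketch of the two forms as functions. The function names and the plain int parameters are my assumptions for illustration; the original post only gives the expression. With optimization enabled, GCC will generally turn a multiplication by a constant power of two into a shift or an add by itself, so the two versions are likely to compile to the same instructions.

```c
/* Hypothetical helpers: the names and the int parameters are assumptions.
   Both forms assume non-negative values that do not overflow an int. */

/* Original form: the row term is multiplied by 2. */
int addr_mul(int video_base, int wnd_x, int pos_x,
             int wnd_y, int pos_y, int scr_x)
{
    return video_base + (wnd_x + pos_x) + ((wnd_y + pos_y) * scr_x) * 2;
}

/* Suggested form: the row term is shifted left by one bit. */
int addr_shl(int video_base, int wnd_x, int pos_x,
             int wnd_y, int pos_y, int scr_x)
{
    return video_base + (wnd_x + pos_x) + (((wnd_y + pos_y) * scr_x) << 1);
}
```

One way to compare them is to compile with something like gcc -O2 -S -masm=intel (or -Os for size) and read the two functions side by side in the generated .s file.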


The only real caveat here is if the left shift causes a register overflow.

I do seem to remember something like this in a piece I was working on, not related to video but to time_t in 32 bits.


You can always test the high bit of the expression to verify that it is a 0, in which case a shift will/should be ok. Usually, getting to this level of optimization is a case of "is it fast enough?". Code logically and succinctly first. Then, if it is too slow, look at optimizations. There are always trade-offs, especially in the domain of efficiency vs. simplicity/ease-of-maintenance. To quote myself:

"First, make it as simple/clear/reliable as possible, then make it fast, but ONLY if you need to!"...

After 30+ years of serious coding, this is advice I give myself every day! :-) I.e., keep it lean and mean, and then worry about performance. Usually, the former takes care of the latter.
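The high-bit test mentioned above could look something like this. It is only a sketch: the helper name and the choice of a 32-bit unsigned value are assumptions, not something from the original code.

```c
#include <stdint.h>

/* Hypothetical helper: shifts value left by one bit only when the top bit
   is clear, so no set bit is lost. Returns 1 and stores the result in *out
   on success; returns 0 and leaves *out untouched otherwise. */
int shl1_checked(uint32_t value, uint32_t *out)
{
    if (value & UINT32_C(0x80000000)) {
        return 0;               /* high bit set: the shift would drop it */
    }
    *out = value << 1;
    return 1;
}
```

A caller would check the return value and fall back to a wider type, or report an error, when it is 0.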


keep it lean, and mean, and then worry about performance. Usually, the former takes care of the latter.

Wise words indeed, and the only layer I would add on top of that is to test each snippet with all, or at least a reasonable number, of the possibilities to make sure it will work properly. It takes a bunch of time initially, but it sure does pay off when one starts integrating a few hundred modules.

With the project I'm undertaking, I like to focus my attention on two things: the x86 architecture and the Intel instruction set. I was hoping to download a simple 64-bit compiler for Windows, analogous to FASMW at 143k. I guess I'll cross that bridge when I get to it.
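As a rough illustration of that kind of exhaustive check, the sketch below walks every window origin and cursor position that stays on screen and compares the multiply form of the offset against the shift form. All names here are hypothetical, and the offset is computed without VIDEO_BASE to keep it short.

```c
/* Hypothetical check: every name here is an assumption for illustration. */

/* Offset with the row term multiplied by 2. */
static unsigned offset_mul(unsigned x, unsigned y, unsigned scr_x)
{
    return x + (y * scr_x) * 2u;
}

/* Offset with the row term shifted left by one bit. */
static unsigned offset_shl(unsigned x, unsigned y, unsigned scr_x)
{
    return x + ((y * scr_x) << 1);
}

/* Tries every (WND_Y, POS_Y, WND_X, POS_X) combination whose sums stay
   inside a scr_x by scr_y screen and counts how often the forms differ. */
unsigned long count_mismatches(unsigned scr_x, unsigned scr_y)
{
    unsigned long bad = 0;
    unsigned wnd_y, pos_y, wnd_x, pos_x;

    for (wnd_y = 0; wnd_y < scr_y; wnd_y++)
        for (pos_y = 0; wnd_y + pos_y < scr_y; pos_y++)
            for (wnd_x = 0; wnd_x < scr_x; wnd_x++)
                for (pos_x = 0; wnd_x + pos_x < scr_x; pos_x++)
                    if (offset_mul(wnd_x + pos_x, wnd_y + pos_y, scr_x) !=
                        offset_shl(wnd_x + pos_x, wnd_y + pos_y, scr_x))
                        bad++;
    return bad;
}
```

Calling count_mismatches(80, 25), assuming an 80x25 text screen, covers about a million combinations and should return 0 if the two forms agree everywhere.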
