Hi, I want to write an x86 assembly function that multiplies two 64-bit integers on a 32-bit processor. It has the following C signature:

void llmultiply(unsigned long long int l1, unsigned long long int l2, unsigned char *result);

The result of l1 * l2 is to be put in the array pointed to by result.

After:

push ebp
mov ebp, esp

the stack looks like this:

| least significant byte of ebp                             | 0x3fffffe4  <---- ebp and stack pointer point here
|                 byte 1 of ebp                             | 0x3fffffe5
|                 byte 2 of ebp                             | 0x3fffffe6
|  most significant byte of ebp                             | 0x3fffffe7
| least significant byte of return address (byte 0)         | 0x3fffffe8  <--- the stack pointer points here after entering the function
|             byte 1 of return address                      | 0x3fffffe9
|             byte 2 of return address                      | 0x3fffffea
| most significant byte of return address (byte 3)          | 0x3fffffeb
| least significant byte of parameter 1 (byte 0 of l1)      | 0x3fffffec  ---
|                  byte 1 of l1                             | 0x3fffffed    |   here is A low
|                  byte 2 of l1                             | 0x3fffffee    |
|                  byte 3 of l1                             | 0x3fffffef  ---
|                  byte 4 of l1                             | 0x3ffffff0  ---
|                  byte 5 of l1                             | 0x3ffffff1    |   here is A high
|                  byte 6 of l1                             | 0x3ffffff2    |
| most significant byte of parameter 1 (byte 7 of l1)       | 0x3ffffff3  ---
| least significant byte of parameter 2 (byte 0 of l2)      | 0x3ffffff4  ---
|                  byte 1 of l2                             | 0x3ffffff5    |   here is B low
|                  byte 2 of l2                             | 0x3ffffff6    |
|                  byte 3 of l2                             | 0x3ffffff7  ---
|                  byte 4 of l2                             | 0x3ffffff8  ---   <---- [ebp + 20] points here
|                  byte 5 of l2                             | 0x3ffffff9    |   here is B high
|                  byte 6 of l2                             | 0x3ffffffa    |
| most significant byte of parameter 2 (byte 7 of l2)       | 0x3ffffffb  ---
| least significant byte of the address of the result array | 0x3ffffffc
|                 byte 1 of the address of the result array | 0x3ffffffd
|                 byte 2 of the address of the result array | 0x3ffffffe
|  most significant byte of the address of the result array | 0x3fffffff  <--- stack bottom
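For what it's worth, the layout above can be cross-checked with a small C model. This is only a sketch: `struct frame` and its field names are made up for illustration, and the result pointer is modelled as a 32-bit integer, which assumes a 32-bit target where arguments are pushed right to left as in the diagram.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the stack as seen from EBP after
   "push ebp / mov ebp, esp". Every field is 4 bytes wide, so the
   compiler inserts no padding and offsetof() gives the EBP offset. */
struct frame {
    uint32_t saved_ebp;   /* [ebp + 0]  caller's EBP            */
    uint32_t ret_addr;    /* [ebp + 4]  return address          */
    uint32_t a_low;       /* [ebp + 8]  low  32 bits of l1 (AL) */
    uint32_t a_high;      /* [ebp + 12] high 32 bits of l1 (AH) */
    uint32_t b_low;       /* [ebp + 16] low  32 bits of l2 (BL) */
    uint32_t b_high;      /* [ebp + 20] high 32 bits of l2 (BH) */
    uint32_t result_ptr;  /* [ebp + 24] address of result array (32-bit pointer assumed) */
};
```

With this model, `offsetof(struct frame, b_high)` comes out as 20, which agrees with the `[ebp + 20]` marker in the diagram.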

The multiplication can be done like this:
a * b = AH * BH * 2^64 + (AH * BL + AL * BH) * 2^32 + AL * BL.

Here H and L stand for the high and low 32 bits of a and b. The multiplication itself is not my biggest problem, though; I could solve that myself. What I'm having trouble with is storing the product in the result array, so that it can be reached through the result pointer in:

void llmultiply(unsigned long long int l1, unsigned long long int l2, unsigned char *result);

So if someone could help me with that part at least, I'd appreciate it. Help with the multiplication part would also be welcome, but it's not as important.
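To make the formula concrete, here is a rough C sketch of the four 32 x 32 -> 64 bit partial products it is built from. The names `partials` and `split_products` are invented for this example, and the code relies on the compiler's 64-bit `uint64_t`, so it models the arithmetic rather than the 32-bit assembly.

```c
#include <stdint.h>

/* Hypothetical helper type: the four partial products of a * b. */
typedef struct {
    uint64_t ll;  /* AL * BL, weight 2^0  */
    uint64_t lh;  /* AL * BH, weight 2^32 */
    uint64_t hl;  /* AH * BL, weight 2^32 */
    uint64_t hh;  /* AH * BH, weight 2^64 */
} partials;

static partials split_products(uint64_t a, uint64_t b) {
    /* Split each operand into its low and high 32-bit halves. */
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b, bh = (uint32_t)(b >> 32);
    partials p;
    /* Widen one factor first so each product is computed in 64 bits,
       which is what a single MUL leaves in EDX:EAX. */
    p.ll = (uint64_t)al * bl;
    p.lh = (uint64_t)al * bh;
    p.hl = (uint64_t)ah * bl;
    p.hh = (uint64_t)ah * bh;
    return p;
}
```

Each partial product is itself 64 bits wide, so the two middle terms overlap both the low and the high half of the result. That overlap is where the carries have to be tracked.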

> 2^64
The 2^64 term starts 8 bytes into your array, and the 2^32 term starts 4 bytes into it.

But isn't the overall result 128 bits in total?

Yes, the result is 128 bits: the array holds four 32-bit words of the product, one at each of four consecutive 4-byte positions.
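As an illustration of how one 32-bit word lands in the byte array, here is a small C sketch. The helper name `store_word` is hypothetical, and it writes the bytes in little-endian order to match what a 32-bit store does on x86.

```c
#include <stdint.h>

/* Hypothetical helper: store the 32-bit word w as four little-endian
   bytes starting at result[4 * index]. Index 0 is the least significant
   word of the product, index 3 the most significant. */
static void store_word(unsigned char *result, unsigned index, uint32_t w) {
    unsigned char *p = result + 4u * index;
    p[0] = (unsigned char)(w);        /* least significant byte first */
    p[1] = (unsigned char)(w >> 8);
    p[2] = (unsigned char)(w >> 16);
    p[3] = (unsigned char)(w >> 24);  /* most significant byte last   */
}
```

In assembly the same effect should come from loading the result pointer into a register and doing one 32-bit MOV per word at offsets 0, 4, 8 and 12 from it.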

So this is what I've got now:

SECTION .data

	SECTION .text
	ALIGN	16
	BITS	32

AL_OFF	EQU     8	; Offset from EBP to low  bits of a (AL)
AH_OFF	EQU     12	; Offset from EBP to high bits of a (AH)
BL_OFF	EQU     16	; Offset from EBP to low  bits of b (BL)
BH_OFF	EQU     20	; Offset from EBP to high bits of b (BH)
RES_OFF	EQU     24	; Offset from EBP to result array pointer
        
	GLOBAL llmultiply

llmultiply:
    PUSH EBP
    MOV EBP, ESP
		
    PUSH EAX
    PUSH EBX
    PUSH ECX
    PUSH EDX	
    PUSH EDI
    PUSH ESI

    MOV EAX, [EBP + AL_OFF]
    MOV EDX, [EBP + BL_OFF]
    MUL EDX			 ; AL * BL, EAX = low(AL * BL), EDX = high(AL * BL)

    MOV EDI, EAX	         ; Save low(AL * BL) in EDI
    MOV ESI, EDX                ; Save high(AL * BL) in ESI

    MOV EAX, [EBP + AH_OFF]
    MOV EDX, [EBP + BL_OFF]
    MUL EDX			   ; AH * BL, EAX = low(AH * BL), EDX = high(AH * BL)

    ADD ESI, EAX	           ; Add low(AH * BL) to high(AL * BL)
    MOV ECX, EDX               	; ECX = high(AH * BL); MOV leaves the carry flag unchanged
    ADC ECX, 0                 	; Add carry from the previous addition to high(AH * BL) in ECX

    MOV EAX, [EBP + AL_OFF]
    MOV EDX, [EBP + BH_OFF]
    MUL EDX                 	; AL * BH, EAX = low(AL * BH), EDX = high(AL * BH)

    ADD ESI, EAX             	; Add low(AL * BH) to high(AL * BL) and low(AH * BL) 
    ADC ECX, EDX             	; Add carry from the previous addition and high(AL * BH) to high(AH * BL) and earlier carry, save in ECX

    MOV EAX, [EBP + AH_OFF]
    MOV EDX, [EBP + BH_OFF]
    MUL EDX                 	; AH * BH, EAX = low(AH * BH), EDX = high(AH * BH)

    ADD EAX, ECX             	; Add high(AL * BH), high(AH * BL) and carry to low(AH * BH)
    ADC EDX, 0               	; Add carry from the previous addition to high(AH * BH)

    MOV EBX, [EBP + RES_OFF]
    MOV [EBX], EDI		; Put low 32 bit of result in array
    MOV [EBX + 4], ESI		; Put next 32 bit of result in array
    MOV [EBX + 8], EAX		; Put next 32 bit of result in array
    MOV [EBX + 12], EDX        ; Put high 32 bit of result in array

    POP ESI
    POP EDI
    POP EDX
    POP ECX
    POP EBX
    POP EAX
		
    POP EBP			; restore EBP reg
    RET				; return

It's almost working, but there's a carry missing somewhere. For example:

FFFEFFFFFFFFFFFF * FFFF0001FFFFFFFF should be: FFFE0002FFFDFFFE0001FFFE00000001

but I get:
FFFE0001FFFDFFFE0001FFFE00000001

So there should be another carry added to EDX.
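One way to track down the missing carry is to model the register flow in C. The sketch below mirrors the listing with variables named after its registers. The extra `carry` variable is an assumption of mine, not something in the listing; it captures the carry out of the second addition into ECX, which the listing never adds anywhere.

```c
#include <stdint.h>

/* Sketch: 64 x 64 -> 128 bit multiply using 32-bit words only.
   res[0] is the least significant word. Hypothetical helper name. */
static void mul64_words(uint64_t a, uint64_t b, uint32_t res[4]) {
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b, bh = (uint32_t)(b >> 32);
    uint64_t p, t;
    uint32_t edi, esi, ecx, cf, carry;

    p   = (uint64_t)al * bl;                 /* AL * BL */
    edi = (uint32_t)p;
    esi = (uint32_t)(p >> 32);

    p   = (uint64_t)ah * bl;                 /* AH * BL */
    t   = (uint64_t)esi + (uint32_t)p;       /* ADD ESI, EAX */
    esi = (uint32_t)t;
    cf  = (uint32_t)(t >> 32);
    ecx = (uint32_t)(p >> 32) + cf;          /* at most 0xFFFFFFFF, no overflow */

    p   = (uint64_t)al * bh;                 /* AL * BH */
    t   = (uint64_t)esi + (uint32_t)p;       /* ADD ESI, EAX */
    esi = (uint32_t)t;
    cf  = (uint32_t)(t >> 32);
    t   = (uint64_t)ecx + (uint32_t)(p >> 32) + cf;   /* ADC ECX, EDX */
    ecx = (uint32_t)t;
    carry = (uint32_t)(t >> 32);             /* carry out of ECX: must be kept */

    p   = (uint64_t)ah * bh;                 /* AH * BH */
    t   = (uint64_t)ecx + (uint32_t)p;       /* ADD EAX, ECX */
    cf  = (uint32_t)(t >> 32);

    res[0] = edi;
    res[1] = esi;
    res[2] = (uint32_t)t;
    res[3] = (uint32_t)(p >> 32) + cf + carry;   /* high word gets both carries */
}
```

For FFFEFFFFFFFFFFFF * FFFF0001FFFFFFFF, the addition marked "ADC ECX, EDX" is FFFEFFFF + FFFF0000, which overflows 32 bits. Keeping that carry gives FFFE0002 in the top word, and dropping it gives FFFE0001. In the assembly, one possible fix (untested here) is to clear a spare register such as EBX before the second ADD ESI, EAX, put ADC EBX, 0 right after the second ADC ECX, EDX, and end with ADC EDX, EBX instead of ADC EDX, 0.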
