I made a simple assembly program about a year ago and I lost the source, but I still have the .exe file. I wanted to see the assembly code, so I disassembled the program with OllyDbg and there was MUCH MORE code. There were about 3000-4000 lines of code, but the original program had fewer than 100! Why?

The disassembler does not reproduce the source code you had before you ran the assembler.
It emits one instruction mnemonic for each opcode byte sequence it finds in the exe; labels, comments and macros are gone.
The traditional hello world program is:

;"hello, world" in assembly language for BeOS
;
;nasm -f elf hello.asm
;ld -s -o hello hello.o

section	.text
    global _start			;must be declared for linker (ld)

_syscall:			;system call
	int	0x25
	ret

_start:				;tell linker entry point
	push	dword len	;message length
	push	dword msg	;message to write
	push	dword 1		;file descriptor (stdout)
	mov	eax,0x3		;system call number (sys_write)
	call	_syscall	;call kernel
	add	esp,12		;clean stack (3 * 4)

	push	dword 0		;exit code
	mov	eax,0x3f	;system call number (sys_exit)
	call	_syscall	;call kernel
				;no need to clean stack

section	.data

msg	db	"Hello, world!",0xa	;our dear string
len	equ	$ - msg			;length of our dear string

It assembles to 16 bytes and disassembles to ~1K without comments: every byte in a dword gets a separate push mov call sequence, every letter in the text.

Edited 5 Years Ago by almostbob: n/a

Is there any disassembler whose output comes out at the original size (or at least closer to it)?

Nope.
They all produce pretty much the same output as the DOS debugger:
the simplest byte-level opcodes.
The statements from a higher-level source are lost; that's why assembler is so efficient.
The Hello world in C or Pascal is very much larger, because the compiler links in runtime and library code.

> every byte in a dword gets a separate push mov call sequence, every letter in the text

I am afraid you misinterpret the disassembly. Can you post the disassembler output?

If you wrote a program in Assembly, then when you pass your code through the assembler you get a "one to one" translation (what you wrote in Assembly is what will be in the exe file). If, on the other hand, you thought you wrote in Assembly but passed your code through a compiler, then yes, you will have more code in the debugger/disassembler.
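
To make the "one to one" point concrete, here is a minimal sketch in Python. It is not a real disassembler: it only knows the seven opcodes used by the hello world program posted earlier in this thread. The bytes in HELLO_CODE are hand-assembled for illustration, the address 0x8049000 for msg is made up, and a real assembler may pick longer encodings for some of the pushes.

```python
import struct

# Hand-assembled bytes for the hello world code above (an assumption for
# illustration; a real assembler may choose different encodings).
HELLO_CODE = bytes.fromhex(
    "cd25"          # int 0x25
    "c3"            # ret
    "6a0e"          # push 14          (len)
    "6800900408"    # push 0x8049000   (hypothetical address of msg)
    "6a01"          # push 1           (stdout)
    "b803000000"    # mov eax, 0x3
    "e8eaffffff"    # call _syscall    (relative jump back to offset 0)
    "83c40c"        # add esp, 12
    "6a00"          # push 0           (exit code)
    "b83f000000"    # mov eax, 0x3f
    "e8dbffffff"    # call _syscall
)


def disassemble(code):
    """Decode a handful of 32-bit x86 opcodes, one text line per instruction."""
    lines = []
    i = 0
    while i < len(code):
        op = code[i]
        if op == 0xC3:                                # ret
            text, size = "ret", 1
        elif op == 0xCD:                              # int imm8
            text, size = f"int 0x{code[i + 1]:x}", 2
        elif op == 0x6A:                              # push imm8 (signed)
            (imm,) = struct.unpack_from("<b", code, i + 1)
            text, size = f"push {imm}", 2
        elif op == 0x68:                              # push imm32
            (imm,) = struct.unpack_from("<I", code, i + 1)
            text, size = f"push 0x{imm:x}", 5
        elif op == 0xB8:                              # mov eax, imm32
            (imm,) = struct.unpack_from("<I", code, i + 1)
            text, size = f"mov eax, 0x{imm:x}", 5
        elif op == 0xE8:                              # call rel32
            (rel,) = struct.unpack_from("<i", code, i + 1)
            # The target is relative to the address of the next instruction.
            text, size = f"call 0x{i + 5 + rel:x}", 5
        elif op == 0x83 and code[i + 1] == 0xC4:      # add esp, imm8
            text, size = f"add esp, {code[i + 2]}", 3
        else:                                         # unknown: dump the raw byte
            text, size = f"db 0x{op:02x}", 1
        lines.append(text)
        i += size
    return lines


if __name__ == "__main__":
    for line in disassemble(HELLO_CODE):
        print(line)
```

The source has 11 instructions and the sketch prints 11 lines. What does not come back are the names: both calls show a bare target offset instead of _syscall, and len and msg come back as plain numbers.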

Also, if you link to static libs in your program, then yes, you will seem to have more code than you explicitly wrote, since that code is added at link time. The assembler you used could also add padding to your sections, procs, etc. for alignment purposes, or replace some mnemonics for optimization, and you might be seeing relocations; I can't be sure without seeing your disassembly. But in general, what you write in assembly is what you will see in the debugger. (Not sure about 16 bit asm.)
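
As a rough illustration of the padding point, here is a small Python sketch of how alignment alone adds bytes that you never wrote. The 16-byte alignment and the NOP filler (0x90) are assumptions for the example; your assembler or linker may use a different alignment or filler.

```python
def padding_needed(offset, alignment):
    """Number of bytes to add so that offset becomes a multiple of alignment."""
    return (-offset) % alignment


def pad_code(code, alignment=16, filler=0x90):
    """Append filler bytes (NOP by default) up to the next alignment boundary."""
    return code + bytes([filler]) * padding_needed(len(code), alignment)


if __name__ == "__main__":
    code = bytes(37)  # stand-in for 37 bytes of real code
    padded = pad_code(code)
    print(len(code), "bytes of code,", len(padded) - len(code), "bytes of padding")
```

A disassembler lists each filler byte as its own instruction, so a program with many small aligned procs shows noticeably more lines than its source.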
