
Why is there so much assembly?

I wrote a simple assembly program about a year ago and lost the source, but I still have the .exe file. I wanted to see the assembly code, so I disassembled the program with OllyDbg, and there was MUCH MORE code: about 3000-4000 lines, while the original program had fewer than 100! Why?

sergent
Posting Pro
598 posts since Apr 2011
Reputation Points: 70
Solved Threads: 22
 

The disassembler does not give you back the same high-level source code you had before you ran the assembler.
It produces one opcode mnemonic for each byte sequence within the exe.
The traditional hello world program is:

;"hello, world" in assembly language for BeOS
;
;nasm -f elf hello.asm
;ld -s -o hello hello.o

section	.text
    global _start			;must be declared for linker (ld)

_syscall:			;system call
	int	0x25
	ret

_start:				;tell linker entry point
	push	dword len	;message length
	push	dword msg	;message to write
	push	dword 1		;file descriptor (stdout)
	mov	eax,0x3		;system call number (sys_write)
	call	_syscall	;call kernel
	add	esp,12		;clean stack (3 * 4)

	push	dword 0		;exit code
	mov	eax,0x3f	;system call number (sys_exit)
	call	_syscall	;call kernel
				;no need to clean stack

section	.data

msg	db	"Hello, world!",0xa	;our dear string
len	equ	$ - msg			;length of our dear string


It compiles to 16 bytes and decompiles to ~1K without comments: every byte in a dword gets a separate push mov call sequence, every letter in the text.

almostbob
Posting Sensei
3,148 posts since Jan 2009
Reputation Points: 571
Solved Threads: 376
 

Is there any decompiler that gets back to the original size (or at least closer to it)?

sergent
 

Nope.
They all produce pretty much the output of the DOS debugger: the simplest byte-level opcodes.
The statements from a higher-level source are lost; that's why assembler is so efficient.
A Hello world in C or Pascal is very much larger and contains much of the source.

almostbob
 

Ok thanks

sergent
 

> every byte in a dword gets a separate push mov call sequence, every letter in the text

I am afraid you misinterpret the disassembly. Can you post the disassembler output?

nezachem
Posting Shark
903 posts since Dec 2009
Reputation Points: 719
Solved Threads: 194
 

If you wrote a program in Assembly, then when you pass your code through the assembler you get a "one to one" translation: what you wrote in Assembly is what will be in the exe file. If, on the other hand, you thought you wrote in Assembly but passed your code through a compiler, then yes, you will have more code in the debugger/disassembler.
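
To make the "one to one" point concrete, here is a toy sketch in Python of what a disassembler does with a few bytes. The opcode table is a hand-picked subset of 32-bit x86 encodings, and the names `OPCODES`, `disassemble`, and `machine_code` are made up for illustration; a real disassembler such as OllyDbg handles far more than this.

```python
# Hand-picked subset of 32-bit x86 encodings (illustrative, not complete):
# opcode byte -> (text template, size of the immediate operand in bytes)
OPCODES = {
    0x68: ("push dword 0x{:x}", 4),  # push imm32
    0xB8: ("mov eax, 0x{:x}", 4),    # mov eax, imm32
    0xCD: ("int 0x{:x}", 1),         # int imm8
    0xC3: ("ret", 0),                # near return
    0x90: ("nop", 0),                # one-byte no-op
}


def disassemble(code: bytes) -> list[str]:
    """Decode bytes into one text line per instruction (toy subset only).

    Raises KeyError on any opcode that is not in the table above.
    """
    lines = []
    pos = 0
    while pos < len(code):
        template, imm_size = OPCODES[code[pos]]
        imm = int.from_bytes(code[pos + 1 : pos + 1 + imm_size], "little")
        lines.append(template.format(imm))
        pos += 1 + imm_size
    return lines


# Bytes for: push dword 1 / mov eax, 0x3 / int 0x25 / ret
machine_code = bytes([0x68, 1, 0, 0, 0, 0xB8, 3, 0, 0, 0, 0xCD, 0x25, 0xC3])

for line in disassemble(machine_code):
    print(line)
```

Thirteen bytes come back as four lines, one per instruction. What does not come back are labels, symbolic names such as `msg` or `len`, and comments, since none of those are stored in the instruction bytes.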

Also, if you link to static libs in your program, then yes, you will seem to have more code than you explicitly wrote, since that code is added at link time. The assembler you used could also add padding to your sections, procs, etc. for alignment purposes, or replace some mnemonics for optimization. You might also be seeing relocations; I can't be sure without seeing your disassembly. But in general, what you write in assembly is what you will see in the debugger. (I'm not sure about 16-bit asm.)
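
As a rough illustration of the padding point, here is a small Python sketch of the arithmetic behind alignment. The function name `padding_needed` and the 5-byte/16-byte layout are assumptions chosen for the example, not something taken from your program.

```python
def padding_needed(offset: int, alignment: int) -> int:
    """Return the number of filler bytes needed to round `offset` up to
    the next multiple of `alignment` (0 if it is already aligned)."""
    return (-offset) % alignment


# Hypothetical layout: 5 bytes of code, then a procedure aligned to 16 bytes.
code_size = 5
pad = padding_needed(code_size, 16)
print(pad)  # 11
```

If the filler is single-byte `nop` instructions, those 11 bytes show up in the disassembly as 11 lines that were never in the source.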

GunnerInc
xor eax, eax
Team Colleague
79 posts since Jan 2011
Reputation Points: 38
Solved Threads: 13
 

This question has already been solved
