Don't know if I've chosen the right forum (assembly), but I need help, guys.
I just started looking at disassembly and haven't found any good tutorials or guides.
So I run a program and get, e.g., this (only a few lines; the whole file is bigger):

.text:00401000                 push    ebp
.text:00401001                 mov     ebp, esp
.text:00401003                 sub     esp, 8
.text:00401006                 mov     eax, ds:atexit
.text:0040100B                 leave
.text:0040100C                 jmp     eax
.text:0040100C sub_401000      endp

I only know jmp (jump) and jz, jnz and the other jump-like instructions. What are the other things? What do push, mov, sub and leave mean? What values do ebp, esp and eax have?

All 3 Replies

The MOV instruction is for moving (copying, actually) a value from one place to another: from one register to another, from memory to a register, or from a register to memory. (A register is a special kind of very fast memory inside the CPU, used for performing operations quickly and for holding special values the system needs to keep track of. Note that a single x86 MOV cannot copy directly from one memory location to another; at least one side has to be a register or a constant.) The first argument is the destination, while the second is the source. So, for example, the line mov ebp, esp copies the value in the ESP register to the EBP register. Similarly, the statement mov eax, ds:atexit copies the value in the memory location labelled atexit in the data segment (ds) to the EAX register.

The SUB instruction subtracts the value in the second argument from the first, and puts the result in the first argument location. Thus, sub esp, 8 reduces the value in the ESP register by 8.
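If it helps to see MOV and SUB in action, here is a minimal sketch in Python that treats the registers as entries in a dictionary. It only models the register-to-register and register-minus-constant forms used in the listing, and the starting values are made up for illustration, not taken from a real run.

```python
# Toy register file. The starting values are assumptions for illustration only.
regs = {"eax": 0, "ebp": 0x0012FFC0, "esp": 0x0012FF80}

def mov(dst, src):
    """mov dst, src: copy the value of register src into register dst."""
    regs[dst] = regs[src]

def sub(dst, imm):
    """sub dst, imm: dst = dst - imm, wrapped to 32 bits as the hardware does."""
    regs[dst] = (regs[dst] - imm) & 0xFFFFFFFF

mov("ebp", "esp")   # mov ebp, esp  -> EBP now holds what ESP held
sub("esp", 8)       # sub esp, 8    -> ESP is reduced by 8

print(hex(regs["ebp"]))  # 0x12ff80
print(hex(regs["esp"]))  # 0x12ff78
```

Note that the source register is left unchanged by mov, which is why "copy" is the better word.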

The PUSH instruction will take a bit more explanation. As I've mentioned, EBP, ESP, and EAX are registers; specifically, the 32-bit forms of the base (or frame) pointer, the stack pointer, and the accumulator (general-purpose) register. The value in EAX will depend on the program, but EBP should hold a pointer to a location in the middle of the system stack (more on this in a moment), while ESP holds the current top of the stack.

So, what is the stack, anyway? It is a region of memory set aside for temporary values, organized as a last-in, first-out data structure, conceptually resembling a stack of dishes or some other small, easily piled items. The top of the stack is the memory address where the most recent item was put (on PCs this is actually the lowest address in use, since stacks there grow downward, for historical reasons). When you add something to the stack - referred to as pushing the item onto the stack - the stack pointer is moved by one to the next free address, and the item is copied to the newly allocated location. With the downward-growing stacks used on x86, "moved" means decremented. When an item is removed - popped off of the stack - the stack pointer is simply moved back the other way (incremented, on x86), and the used memory is left as it is, ready to be reused.

That's the basic idea, anyway. There are various complications - the actual word size, for example, is 4 bytes in 32-bit code rather than 1, so a push actually subtracts 4 from the stack pointer and a pop adds 4 - but the basic idea is that you simply push and pop words onto and off of the stack.
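As a rough sketch of the push and pop behaviour described above, here is a toy Python model of a downward-growing stack of 4-byte words. The class name Stack32 and the starting address are my own inventions for illustration; real hardware just has ESP and memory.

```python
class Stack32:
    """Toy model of a downward-growing stack of 4-byte words."""

    def __init__(self, top=0x0012FFC0):
        self.esp = top   # hypothetical starting stack pointer
        self.mem = {}    # address -> 32-bit word

    def push(self, value):
        self.esp -= 4                            # make room: ESP moves down by one word
        self.mem[self.esp] = value & 0xFFFFFFFF  # then store the item at the new top

    def pop(self):
        value = self.mem[self.esp]               # read the word at the top
        self.esp += 4                            # ESP moves back up; old slot is not erased
        return value

s = Stack32()
s.push(0x11111111)
s.push(0x22222222)
first = s.pop()    # last in, first out: this is 0x22222222
second = s.pop()   # then 0x11111111
```

After the two pops the stack pointer is back where it started, and the two words are still sitting in memory below it, waiting to be overwritten by the next push.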

Why is this so important that it has special hardware support? Because when combined with a base pointer, it makes for an easy way to hold a group of local variables for a function, something called an activation record or stack frame. Basically, the idea is that you push the current base pointer onto the stack, then copy the current stack pointer into the base pointer. This now becomes the base address of the activation record (hence the name 'base pointer'). Then the stack pointer is moved (decremented on x86, as sub esp, 8 does in the example code) far enough to make room for all the local variables that the function uses. To access a local value, you use an offset from the base pointer, which points to the location reserved for it. When the function exits, you copy the base pointer back to the stack pointer and pop off the old base pointer, restoring the state of the previous function.
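To make the dance a little more concrete, here is a small Python sketch that walks through setting up and tearing down a frame, using the same three instructions that open the listing in the question. The register values, and the idea of two 4-byte locals at ebp-4 and ebp-8, are assumptions for illustration only.

```python
mem = {}                                        # address -> 32-bit word
regs = {"esp": 0x0012FF80, "ebp": 0x0012FFC0}   # hypothetical values on entry

def push(value):
    regs["esp"] -= 4
    mem[regs["esp"]] = value

def pop():
    value = mem[regs["esp"]]
    regs["esp"] += 4
    return value

caller_ebp = regs["ebp"]
entry_esp = regs["esp"]

# Set up the frame (the first three lines of the listing)
push(regs["ebp"])            # push ebp      -> save the caller's base pointer
regs["ebp"] = regs["esp"]    # mov ebp, esp  -> base of the new frame
regs["esp"] -= 8             # sub esp, 8    -> room for 8 bytes of locals

frame_base = regs["ebp"]

# Locals are addressed as offsets from EBP (two assumed 4-byte variables)
mem[regs["ebp"] - 4] = 42
mem[regs["ebp"] - 8] = 7

# Tear the frame down again
regs["esp"] = regs["ebp"]    # mov esp, ebp  -> discard the locals
regs["ebp"] = pop()          # pop ebp       -> restore the caller's base pointer
```

At the end both registers hold exactly what they held on entry, which is the whole point: the caller's frame is untouched no matter what the function did with its own locals.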

It is a complicated dance, and one which confuses almost everyone when they first encounter it, so don't be surprised if this is a bit over your head right now.

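To understand the LEAVE operation, you need to know about its counterpart, ENTER. This page does a better job of explaining it than I probably could. To sum it up, ENTER creates the activation record, while LEAVE cleans it up: LEAVE does the same job as mov esp, ebp followed by pop ebp. In your listing the frame is created by hand with push ebp, mov ebp, esp and sub esp, 8 rather than with ENTER, which is what most compilers do.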

I hope that I've made at least some sense to you; the truth is you really can't learn about disassembling without learning assembly language first. I would recommend referring to the x86 instruction set listing on Wikipedia as a starting point; a good book on the subject, such as Assembly Language Step by Step, would be a great asset as well.

Thank you that helped me very much. One more question: Is it true that on x86_64 (amd64) architectures ESP = RSP, EBP = RBP and so on????

Not exactly. The Rxx registers are the 64-bit versions of those registers, which supersede - and incorporate - the 32-bit Exx registers. By this I mean that the register EAX, for example, still exists, but it is now a sub-section of the RAX register, just as the 16-bit AX register is a part of the EAX register (and of RAX as well). The lower half of the RAX register is the EAX register, while the lower half of EAX is AX; the two halves of AX are in turn AH (high byte) and AL (low byte). This posting goes into additional detail, and may help you understand things better.
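One way to picture the nesting is with bit masks. The Python sketch below picks an arbitrary 64-bit value for RAX and shows which part of it you would see through each of the smaller names; it only illustrates how the bits overlap and does not model what happens when you write to a sub-register.

```python
rax = 0x1122334455667788   # arbitrary example value

eax = rax & 0xFFFFFFFF     # EAX: lower 32 bits of RAX
ax = eax & 0xFFFF          # AX:  lower 16 bits of EAX
ah = (ax >> 8) & 0xFF      # AH:  upper byte of AX
al = ax & 0xFF             # AL:  lower byte of AX

print(hex(eax))  # 0x55667788
print(hex(ax))   # 0x7788
print(hex(ah))   # 0x77
print(hex(al))   # 0x88
```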
