Learning Raw Assembly

Question

Labdabeta 182 Posting Pro in Training

12 Years Ago

Hello, I only have a basic knowledge of ASM because every tutorial I have found (including Narue's) is based on using some kind of high level library or another. However when I look through disassemblies of pretty much any program I notice that all such library calls are gone. My questions are:

A) Did the assembler simply copy-paste the library code in (like a header file in c/c++)
B) Where can I learn general assembly?
C) I noticed that pretty much every tutorial uses different layout (IE: data:code: vs section data, etc) what are these and how can I know which one is correct?
D) I always hear that you need to specialize your assembly code to your exact operating system, but then how come I can run certain programs across all the systems? What sort of sorcery is going on?!

assembly operating-system


All 13 Replies

Schol-R-LEA 1,446 Commie Mutant Traitor

12 Years Ago

According to your personal profile, you are running Windows 7, which means you are running an x86 processor, either a 32-bit system or (more likely, if it is running 7) a 64-bit 'long' mode system. There are several assemblers for Windows - Macro Assembler (MASM), GNU Assembler (GAS), and Netwide Assembler (NASM) being the most common - so it would depend on which one you are using.

(Unless you are using an emulator such as SPIM or PEP-8, in which case you are looking at whatever assembler comes with the simulator. The same applies to pseudo-machines such as the JVM or the .NET Common Language Runtime.)

As for the disassemblies, if they are for Windows native code, then there should in fact be library calls. In x86 assembly, these would take the form of either a CALL instruction for the standard libraries, or if you are invoking a Windows system call directly, a SYSENTER or a task gate (IIRC). However, there is no easy way to determine which function call is going to which external routine in a disassembly, since the source symbol tables aren't available. Disassemblers can only infer so much without the symbols.

OTOH, if they are managed CLR code (which is what you'd get from, say, C# disassemblies), then you aren't really looking at native code at all - the 'assembly language' is that of the Microsoft Common Language Runtime, a simulated virtual machine used to simplify the compilation process and make programs more 'secure' (supposedly). These will look very different from true x86 assembly code, though like with x86 assembly, the CALL instruction is the one used to call routines, both within a program and in the libraries.

So the question remains: which assembler are you using?

Edited 12 Years Ago by Schol-R-LEA

Schol-R-LEA 1,446 Commie Mutant Traitor

12 Years Ago

Did the assembler simply copy-paste the library code in (like a header file in c/c++)

If it was a statically linked library, yes.

Er, not exactly. The library code is not in the form of source code, but rather is a binary file containing the compiled or assembled code in what is known as an 'object format' (which has nothing to do with object-oriented programming). The object format (which under Windows is COFF; the closely related executable format is called PE) contains machine code, except that the references to specific addresses are stubbed out, as well as the symbol information needed by the linker and loader to patch the addresses into place once the program is run.

When the program is linked to the library (a step which most modern IDEs do automatically, but which historically was a completely separate step from compiling), the object code sections for the specific library routines are extracted and patched into the program to form the executable file. The linker resolves the calls in the program to match the linked-in library code.
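To make the linker's patching step concrete, here is a minimal sketch in Python. It is a toy model only: the relocation table layout, the symbol name `print_line`, and the address 0x40 are all made up for illustration and do not reflect any real object format. The one real detail is that an x86 `CALL rel32` (opcode 0xE8) stores its target as a signed 32-bit distance from the end of the instruction.

```python
import struct

# Toy "object code": CALL rel32 with a stubbed-out target, followed by RET.
CALL_OPCODE = 0xE8
RET_OPCODE = 0xC3
code = bytearray([CALL_OPCODE, 0, 0, 0, 0, RET_OPCODE])

# Hypothetical relocation table: (offset of the 4-byte field, symbol it refers to).
relocations = [(1, "print_line")]

# Hypothetical symbol table, as it might look once the library routine
# has been placed in the output image.
symbols = {"print_line": 0x40}

def link(code, relocations, symbols):
    """Return a copy of code with every stubbed rel32 field patched."""
    patched = bytearray(code)
    for offset, name in relocations:
        # rel32 is measured from the end of the 4-byte field, i.e. the
        # address of the next instruction.
        next_instruction = offset + 4
        displacement = symbols[name] - next_instruction
        struct.pack_into("<i", patched, offset, displacement)
    return patched

linked = link(code, relocations, symbols)
print(linked.hex())  # e83b000000c3
```

The four zero bytes after the opcode are the "stubbed out" reference; after linking they hold 0x3B, the distance from offset 5 to offset 0x40.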

Now, the executable file is still not pure machine code, because the specific addresses for jumps and so forth still haven't been resolved; doing that is the job of the loader, which is the part of the operating system that loads the program into memory. The addresses cannot be resolved until run time because the system can (and will) load the code into different locations in memory. That holds even though each process gets its own virtual memory space, since the dynamic link libraries a process maps in may land at different locations in the memory map from one run to the next.
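The loader's side of this can be sketched the same way. The following Python toy assumes an image that was linked for one base address and carries a list of offsets holding absolute addresses; the base values and the `fixups` list are invented for illustration, and real loaders work from the relocation data stored in the executable format.

```python
import struct

PREFERRED_BASE = 0x400000  # base address the image was linked for (illustrative)

# Toy image: two 4-byte little-endian absolute addresses, valid only if the
# image is loaded at PREFERRED_BASE.
image = bytearray(struct.pack("<II", PREFERRED_BASE + 0x10, PREFERRED_BASE + 0x24))

# Hypothetical fixup list: offsets within the image that hold absolute addresses.
fixups = [0, 4]

def rebase(image, fixups, actual_base):
    """Return a copy of image adjusted for the address it was really loaded at."""
    delta = actual_base - PREFERRED_BASE
    loaded = bytearray(image)
    for offset in fixups:
        (address,) = struct.unpack_from("<I", loaded, offset)
        struct.pack_into("<I", loaded, offset, address + delta)
    return loaded

loaded = rebase(image, fixups, 0x10000000)
print([hex(a) for a in struct.unpack("<II", loaded)])  # ['0x10000010', '0x10000024']
```

If the image happens to be loaded at its preferred base, the delta is zero and nothing changes, which is why the same file can work at either location.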

If you want to know more about this, see the online text for Linkers and Loaders by John R. Levine.

Edited 12 Years Ago by Schol-R-LEA

Schol-R-LEA 1,446 Commie Mutant Traitor

12 Years Ago

Deceptikon: Point taken.

Labdabeta: I personally prefer NASM, but the one that comes with the GNU Compiler Collection (and hence is the default for Code::Blocks and FSF packages in general) is GAS. The problem with GAS is that it uses a notation based on the older AT&T UNIX assemblers, which is radically different from the Intel-style assemblers such as MASM and NASM. A good overview of the differences between the two syntaxes can be found on the OSDev wiki. While it isn't too difficult to get used to AT&T style, most Windows programmers are more familiar with the Intel format.
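To give a feel for the main differences, here is a small Python sketch that rewrites a very limited subset of Intel-style lines in AT&T style. It only handles two-operand instructions on 32-bit registers and unsigned immediates; the function names are my own, and memory operands, other operand sizes, and directives are deliberately left out. The three rules it shows are real: AT&T puts the source first, prefixes registers with `%` and immediates with `$`, and appends a size suffix (`l` for 32 bits) to the mnemonic.

```python
REGISTERS_32 = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}

def operand_to_att(operand):
    """Translate one Intel-style operand (register or immediate) to AT&T style."""
    operand = operand.strip().lower()
    if operand in REGISTERS_32:
        return "%" + operand
    if operand[:1].isdigit():
        return "$" + operand
    raise ValueError(f"unsupported operand: {operand!r}")

def intel_to_att(line):
    """Translate 'mnemonic dst, src' into 'mnemonicl src, dst'."""
    mnemonic, _, rest = line.strip().partition(" ")
    dst, src = rest.split(",")
    return f"{mnemonic.lower()}l {operand_to_att(src)}, {operand_to_att(dst)}"

print(intel_to_att("mov eax, 5"))    # movl $5, %eax
print(intel_to_att("add ebx, eax"))  # addl %eax, %ebx
```

Anything outside that subset, such as `mov eax, [ebx]`, raises `ValueError` rather than guessing, which is where most of the real work in a full converter would go.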

Edited 12 Years Ago by Schol-R-LEA

Schol-R-LEA 1,446 Commie Mutant Traitor

12 Years Ago

NASM probably is your best bet, then; it shouldn't be difficult to get it to work with Code::Blocks. You can get the Windows installer from here.
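One thing to be aware of: NASM on its own produces an object file, so a separate link step is needed to get an .exe. As a rough sketch, the two command lines could be built like this in Python. The helper name `build_commands` is hypothetical; it assumes `nasm` and a MinGW `gcc` are on the PATH, a 32-bit target (`-f win32`), and a source file that defines the entry symbol the linker expects. It only constructs the commands and does not run them.

```python
def build_commands(source, output, object_format="win32"):
    """Return (assemble, link) argument lists for a NASM source file."""
    object_file = source.rsplit(".", 1)[0] + ".obj"
    # nasm -f <format> <source> -o <object>: assemble to an object file.
    assemble = ["nasm", "-f", object_format, source, "-o", object_file]
    # gcc <object> -o <exe>: let the compiler driver invoke the linker.
    link = ["gcc", object_file, "-o", output]
    return assemble, link

assemble, link = build_commands("mySourceFile.asm", "myOutputFile.exe")
print(" ".join(assemble))
print(" ".join(link))
```

Each list can be passed to `subprocess.run` as is, or copied into a custom build command in the IDE.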


deceptikon 1,790 Code Sniper Team Colleague Featured Poster · Answer 1 · 2013-05-16T17:23:11+00:00

However when I look through disassemblies of pretty much any program I notice that all such library calls are gone.

Those calls are hidden in the disassembly because you're looking at direct jumps into the libraries rather than the calls an assembly programmer would write.

A) Did the assembler simply copy-paste the library code in (like a header file in c/c++)

If it was a statically linked library, yes.

B) Where can I learn general assembly?

I think you're asking the wrong question. You seem to want to know how to write bare metal assembly, where you use OS interrupts to handle the lowest level of I/O and system calls rather than depending on existing libraries (such as used by Narue's tutorial). Personally, I think that's kind of dumb for the same reason you'd use the available printf() in C instead of writing it yourself.

As far as general assembly, any book on assembly language should cover what you need. Just go to the bookstore and flip through them to see which one you mesh with best.

C) I noticed that pretty much every tutorial uses different layout (IE: data:code: vs section data, etc) what are these and how can I know which one is correct?

Every assembly language dialect has a different syntax. That has nothing to do with assembly and everything to do with your particular assembler. Just read the documentation to get a feel for how it works and best practices.

D) I always hear that you need to specialize your assembly code to your exact operating system, but then how come I can run certain programs across all the systems? What sort of sorcery is going on?!

You conform your assembly code to the processor architecture. You can optimize it with knowledge of the operating system, and obviously any system calls depend on the OS. I'd wager that these portable programs don't use any non-portable system calls and the systems you're using all have the same processor architecture.

As an example, if I write an x86 program in assembly, it's not going to work on a 68000 processor. ;)
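The OS half of that can be illustrated with a small Python sketch. Even on the same x86-64 processor, the number a program loads to request the `write` system call differs between operating systems. The table below lists the values as I understand them for three systems; the function name is hypothetical, and Windows is left out on purpose because its system call numbers are not a stable interface and programs normally go through DLLs instead.

```python
import platform

# Number placed in RAX to request the write system call on x86-64.
WRITE_SYSCALL_X86_64 = {
    "Linux": 1,
    "FreeBSD": 4,
    "Darwin": 0x2000004,  # macOS: BSD call 4 plus the 0x2000000 class prefix
}

def write_syscall_number(system):
    """Return the x86-64 write syscall number for an OS name, or None if not listed."""
    return WRITE_SYSCALL_X86_64.get(system)

# platform.system() names the OS and platform.machine() the architecture
# of the machine running this script.
current = platform.system()
print(current, platform.machine(), write_syscall_number(current))
```

Assembly that hard-codes one of these numbers is tied to that OS, while code that only uses ordinary instructions and calls into a portable library is tied only to the processor.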

Labdabeta 182 Posting Pro in Training Featured Poster · Answer 2 · 2013-05-16T17:54:26+00:00

So it seems that what computer you are using and what assembler you pick determines what dialect you will be writing in... if that is the case how do I find a tutorial for MY computer and MY assembler?

Labdabeta 182 Posting Pro in Training Featured Poster · Answer 3 · 2013-05-16T18:46:10+00:00

I have not yet decided... but I would like one that will work well with code::blocks. It already does ASM highlighting, it can't be too hard to find an assembler that will work with it. What do you suggest? All I need is something that can be used at run-time (not a hard thing to ask) as I have already made my own pseudo-compilers from certain esoteric languages to c/c++ and C::B compiles them just fine.

deceptikon 1,790 Code Sniper Team Colleague Featured Poster · Answer 4 · 2013-05-16T18:56:43+00:00

What do you suggest?

NASM or FASM. Those are my favorites, though I've been known to enjoy the novelty of RosASM despite my personal feelings toward Betov and HLA despite its decidedly high level taste.

deceptikon 1,790 Code Sniper Team Colleague Featured Poster · Answer 5 · 2013-05-16T19:14:26+00:00

Er, not exactly.

True, but I'd question your sanity if you thought my brief confirmation of the general concept was anything remotely close to an "exact" description of what typically happens. ;)

Labdabeta 182 Posting Pro in Training Featured Poster · Answer 6 · 2013-05-16T20:11:18+00:00

It seems like writing a program to convert from Intel to AT&T and back again wouldn't be very hard. Is it possible to get an Intel style assembler to work with Code::Blocks? And if so where could I download it? All I want to be able to do is run this: someAssembler.exe mySourceFile.asm myOutputFile.exe. As of right now I am using an odd... thing... It converts asm code to c code then compiles it... IMO it sorta defeats the purpose of assembly, no? Anyways a simple executable that works on simple command line arguments would be perfect.

Labdabeta 182 Posting Pro in Training Featured Poster · Answer 7 · 2013-05-17T12:11:37+00:00

Wow, NASM seems perfect. One more issue, currently all that I know about assembly I learnt from http://www.intelligent-systems.info/classes/ee360/tutorial.htm . I am wondering if there is a decent debugger that could (maybe?) work with the code::blocks debugger. I understand that that is unlikely, so really any good debugger would be awesome. Thanks :)

deceptikon 1,790 Code Sniper Team Colleague Featured Poster · Answer 8 · 2013-05-17T13:28:16+00:00

For assembly I've always preferred OllyDbg. However, I've never run it through an IDE, always separately in its own UI. In fact, I've never used an IDE for assembly code that wasn't embedded in C or C++... ;p

Labdabeta 182 Posting Pro in Training Featured Poster · Answer 9 · 2013-05-17T13:37:09+00:00

Ok... it turns out code::blocks kinda sucks with stand-alone assemblers. It always wants them to shove the assembly code into a library rather than just execute it. So I guess I will write my code in the editor (for syntax highlighting, etc) then assemble it on its own. OllyDbg looks decent, but I really like MSCodeView's interface. Is there any way to use it with NASM instead of MASM?
