Hello,

I am learning IA-32 assembly. I have observed that when I create an object file with gcc from this simple C program:

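#include <unistd.h>   /* declares write() */

int main(void) {
    char buf[256];        /* contents do not matter here; only the generated call does */
    write(1, buf, 256);
}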

it generates the following instruction for the write (using the data on top of the stack for the arguments):

call write

I get similar results for read() as well. What I am wondering is: is this a system call for write, where the linker replaces "write" with a syscall using the appropriate syscall number, or is this something that is linked to the C library?

Essentially, what I would like to know is how to find out exactly what the program is doing. If it is linking to the C library, how do I find out what the code it links to looks like? If it is just a syscall, how does it determine the appropriate syscall number for the kernel?

Thanks,

Allasso

All 7 Replies

There is no such thing as a syscall number. Linking an assembly program to the C library is the same as linking C-only programs. When you write a pure C program, you don't call functions by number but by name. The same goes for your assembly program. Just link with the appropriate C libraries and you are done.

Of course, if you don't want to use the C libraries, then your task is much more difficult. You will have to write all that code yourself. If you use the flat memory model and a 32-bit assembler, then you can call Win32 API functions instead of those 16-bit MS-DOS int instructions.

Thank you kindly for the reply.

"there is no such thing as a syscall number."

When I say syscall number, I am referring to the system call number. Please see:

http://www.gnu.org/s/libc/manual/html_node/System-Calls.html

On some systems these numbers are displayed explicitly in the syscall.h file in the macro definition, for example:

#define SYS_write 4

"4" is the system call number for write (on this machine).

For example,

pushl   $256        # number of bytes to write 
pushl   %eax        # pointer to buffer
pushl   $1          # write to STDOUT
call    write

can be done with

pushl   $256        # number of bytes to write 
pushl   %eax        # pointer to buffer
pushl   $1          # write to STDOUT
pushl   $4          # system call number for write
call    syscall

In C, syscall can be used explicitly (example from the aforementioned page):

rc = syscall(SYS_chmod, "/etc/passwd", 0444);

This is using the macro SYS_chmod, but the system call number for your system can also be used in place of it.
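A minimal sketch of the same idea, assuming Linux with glibc. It uses SYS_getpid rather than SYS_chmod only because that call has no side effects, and pid_via_syscall is a hypothetical helper name, not something from the manual page:

```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_getpid and the other SYS_* macros */
#include <unistd.h>       /* syscall(), getpid() */

/* Hypothetical helper: ask the kernel for our process ID through the
   generic syscall() entry point instead of the getpid() wrapper. */
long pid_via_syscall(void)
{
    return syscall(SYS_getpid);
}
```

Both routes should report the same process ID, so comparing pid_via_syscall() with getpid() is a quick sanity check.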

The aforementioned page also says that:

"(GNU C Library) functions work by making system calls themselves."

Which answers one of the questions in my original post.

My question has to do with how these functions are doing what they are doing. Simply using syscall with the call number for a particular machine would not be portable (obviously). How do they find out how the kernel has defined a particular system call? Is this written in at compile time, or is this done at run time? etc. I have experimented with reverse engineering using objdump, but I am still not seeing it.

I am using Unix systems, so the MS advice isn't much help to me, but thank you.

I am not necessarily interested in writing assembly programs to do what these functions are doing, but this is how I learn. If I can mimic these functions in my assembly code, I will be much farther ahead in my education. I am interested in learning what is happening at a deeper level. I may not be asking the right questions, but I am trying to find my way through it.

Regarding "syscall number", I thought I had heard that term used before, but I may be incorrect. What I meant was "system call number".

I can't help you because I have never used that function. Since it's a C function, your assembly program calls it just like any other C function. You have to be intimately familiar with the operating system for which you are programming.

It is very important to understand that an application and the kernel reside in different address spaces. It means, among other things, that the application cannot possibly call anything in the kernel directly; the only way to "call" it is through the syscall mechanism, which is essentially an interrupt. This answers your first question: anything the application calls is linked from libc.

The implementation of some functions, say read, or write, contains a syscall. Their job essentially is to provide a call number (which is hardcoded into the function), and call syscall(). Again, since the kernel cannot easily access the application stack (as it is in a different address space), the job of syscall() is to arrange its arguments in registers, and perform the interrupt.
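As a rough sketch of that description, assuming Linux with glibc: my_write below is a hypothetical stand-in for the libc wrapper, not the real implementation. Real libc code typically inlines the trap instead of going through syscall(), but the shape is the same: a call number fixed in the function, and the register setup left to the lower layer.

```c
#define _GNU_SOURCE
#include <stddef.h>       /* size_t */
#include <sys/syscall.h>  /* SYS_write */
#include <sys/types.h>    /* ssize_t */
#include <unistd.h>       /* syscall() */

/* Hypothetical stand-in for libc's write(): the call number is fixed
   when this file is compiled, and syscall() moves the arguments into
   registers and traps into the kernel. */
ssize_t my_write(int fd, const void *buf, size_t count)
{
    return (ssize_t)syscall(SYS_write, fd, buf, count);
}
```

Calling my_write(1, "hi\n", 3) should behave like write(1, "hi\n", 3), including returning -1 and setting errno on failure, since syscall() does that translation.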

> Simply using syscall with the call number for a particular machine would not be portable (obviously)

On the contrary, it is portable, at least across kernel versions on a given operating system and architecture. There the syscall numbers never change. A particular kernel version may not support a particular syscall, but the same number always refers to the same syscall.
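A minimal sketch of where that stability stops, assuming Linux and the predefined macros of GCC or Clang. The number is baked in at compile time from the headers for the target, and different architectures use different tables: write is 4 on 32-bit x86 Linux and 1 on x86-64 Linux. Both helper names are hypothetical.

```c
#include <sys/syscall.h>  /* SYS_write, as defined for the target being compiled */

/* Hypothetical helper: the write() call number this sketch expects on a
   few Linux targets, or -1 for any target it does not know about. */
long expected_write_number(void)
{
#if defined(__linux__) && defined(__i386__)
    return 4;   /* 32-bit x86 Linux */
#elif defined(__linux__) && defined(__x86_64__) && !defined(__ILP32__)
    return 1;   /* 64-bit x86 Linux uses a different table */
#else
    return -1;  /* not covered by this sketch */
#endif
}

/* The number the headers fixed for whatever target this was built for. */
long compiled_write_number(void)
{
    return SYS_write;
}
```

So nothing is looked up at run time: a binary built for one target carries that target's numbers, and relies on the kernel keeping them stable.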

Hmmm, your reply is interesting. So you are saying that for any IA-32 machine, you could use the "call syscall" instruction in an assembly program and not link to the C library at all (for that particular instruction, and assuming, as you say, that the syscall is supported by the kernel)? It makes me wonder why one would link to the C library at all, since using "call syscall" would be so straightforward.

I have been going around and around with this, thinking that libc provided code that had to somehow determine the correct syscall number to use for the system the program was being run on.

There are some reasons not to do this. First and foremost is cross-platform portability. syscall() is a Unix artifact, while read, write, and the like are supported on any POSIX platform (and even on some others). Second, syscall() is prototyped as variadic, which prevents the compiler from checking the arguments.

Calling it directly doesn't look too straightforward to me; eliminating the proxy call gives an infinitesimal advantage time-wise, and you still need to link libc for all the other goodies it provides.

Edit: since you are interested in the assembly, disregard my prototyping argument.

"...and you still need to link libc for all the other goodies it provides"

For example? What goodies?
