Hey. Thanks for the reply. I think I understand the issue now - it's definitely relocation. Here's some good info on the subject (stolen off a newsgroup - this was a post by Jack Klein):
The brief overview is this:
The part of the exe format that contains the image of the code is put
together with the assumption that the executable will be loaded into
memory at 0000:0000 (segment 0).
Let's assume for a moment that the object files for your program
contain two code segments of 4K bytes each, and one data segment of 4K
bytes, and the two code segments will be first in the image.
The very first line of the first code segment is a call to a
subroutine that starts on the very first line of the second code
segment. That is, file1.asm contains this:
        extern  func2:far
start:
        call    func2
...and file2.asm contains this:
        public  func2
func2   proc    far
        mov     ax, my_data_segment
        mov     ds, ax
Now if the executable was actually loaded into memory at 0000:0000,
the code for file1 would start at 0000:0000 and end at 0000:0FFF, the
code for file2 (and the address of func2) would start at 0100:0000 and
end at 0100:0FFF, and the data segment would start at 0200:0000. (Each
4K block is 1000h bytes, which is 100h segment units of 16 bytes.)
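To make the arithmetic concrete, here is a rough Python sketch of how those segment values fall out. The helper names `linear` and `layout` are made up for illustration, and it assumes every block is a multiple of 16 bytes, as in the example:

```python
PARAGRAPH = 16  # bytes covered by one step of a real-mode segment value

def linear(segment, offset):
    """Real-mode address: segment * 16 + offset."""
    return segment * PARAGRAPH + offset

def layout(sizes, base_segment=0):
    """Give consecutive segment values to blocks placed back to back.

    Assumes each size is a multiple of 16 bytes (true for the 4K blocks here).
    """
    segments = []
    seg = base_segment
    for size in sizes:
        segments.append(seg)
        seg += size // PARAGRAPH
    return segments

# Two 4K code segments followed by one 4K data segment, linked for segment 0.
code1, code2, data = layout([0x1000, 0x1000, 0x1000])
print(f"{code1:04X} {code2:04X} {data:04X}")  # 0000 0100 0200
print(f"{linear(code2, 0):05X}")               # 01000
```

Passing a different `base_segment` shifts every value by the same amount, which is the whole idea behind the fix-ups described further down.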
So in the code image part of the executable file, it uses those
segment values:
    call func2                  9A00000001
                                      ^^^^
        this 16 bit word contains segment 0100 (low byte first)
    mov ax, my_data_segment     B80002
                                  ^^^^
        this 16 bit word contains segment 0200 (low byte first)
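If the byte order looks odd, this small Python sketch reproduces the two encodings. The function names are my own; the opcodes 9A (far call) and B8 (mov ax, immediate) are the standard 8086 ones, and the operands are stored as little-endian words:

```python
import struct

def far_call(segment, offset):
    """Far call: opcode 9A, then the offset word, then the segment word."""
    return struct.pack("<BHH", 0x9A, offset, segment)

def mov_ax_imm16(value):
    """mov ax, imm16: opcode B8, then the 16-bit value, low byte first."""
    return struct.pack("<BH", 0xB8, value)

print(far_call(0x0100, 0x0000).hex().upper())  # 9A00000001
print(mov_ax_imm16(0x0200).hex().upper())      # B80002
```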
Now we know that the code will not really be loaded at segment 0000.
It might be loaded, for example, at segment 4000. That means that the
call to func2 needs to become:
9A00000041 (call func2 at 4100:0000)
...and the load of the data segment value needs to become:
B80042 (mov AX, 4200)
In fact, every 16 bit value in the program image that represents a
numerical segment can be adjusted for where the program is loaded by
adding the load segment number to the value in the image.
So the relocation table in the header at the front of the exe file
contains the location in the executable image of every single 16 bit
word that represents the numerical value of a segment. After loading
the code image part of the exe file into memory (starting at some
segment), the DOS loader (the EXEC service that command.com calls to
run a program) reads the relocation table entries. For every segment
reference entry it just adds the actual load segment value to the
relative segment value already in the word. So all of the segment
fix-ups are done before the program starts running.
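The fix-up loop can be sketched in a few lines of Python. This is only a toy model, not the real DOS code: `apply_relocations` is a made-up name, the image is a plain bytearray, and each table entry is assumed to be a (segment, offset) pair relative to the start of the image, which is how the MZ header stores them:

```python
import struct

def apply_relocations(image, relocations, load_segment):
    """Add load_segment to every 16-bit word listed in the relocation table.

    image        -- bytearray with the code/data image, linked for segment 0
    relocations  -- (segment, offset) pairs relative to the start of the image
    load_segment -- segment where the image actually ended up
    """
    for seg, off in relocations:
        pos = seg * 16 + off
        (value,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (value + load_segment) & 0xFFFF)
    return image

# Toy image from the example: far call at 0000:0000, mov ax at 0100:0000.
image = bytearray(0x3000)
image[0x0000:0x0005] = bytes.fromhex("9A00000001")
image[0x1000:0x1003] = bytes.fromhex("B80002")

# The segment word of the call sits 3 bytes in; the mov immediate 1 byte in.
relocations = [(0x0000, 0x0003), (0x0100, 0x0001)]
apply_relocations(image, relocations, 0x4000)

print(image[0x0000:0x0005].hex().upper())  # 9A00000041
print(image[0x1000:0x1003].hex().upper())  # B80042
```

Note that the loop never looks at the instructions themselves. It only needs the table, which is why the linker has to record every segment reference up front.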
As for how high-level languages work, those for DOS and/or 16 bit
Windows provide different memory models. These are the common names,
although some compilers might use slight variations:
small (code and data each limited to 64K, all calls are near, all
pointers are near)
medium (multiple code segments can each be up to 64K so total code
can be much larger than 64K, all calls are far, all data pointers are
near)
compact (single code segment limited to 64K, multiple data segments of
up to 64K each, so total data can be more than 64K, all calls are
near, data pointers are far)
large (multiple code segments, multiple data segments, both code and
data can be larger than 64K, calls are far and data pointers are far)
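The near/far distinction behind all of these models comes down to what a pointer stores. A rough Python sketch, with made-up helper names and a hypothetical data segment value of 4200:

```python
import struct

def near_ptr(offset):
    """Near pointer: just a 16-bit offset (2 bytes)."""
    return struct.pack("<H", offset)

def far_ptr(segment, offset):
    """Far pointer: offset word followed by segment word (4 bytes)."""
    return struct.pack("<HH", offset, segment)

def deref_near(ptr, default_segment):
    """A near pointer is always resolved against one fixed segment."""
    (offset,) = struct.unpack("<H", ptr)
    return default_segment * 16 + offset

def deref_far(ptr):
    """A far pointer carries its own segment, so it can reach any of them."""
    offset, segment = struct.unpack("<HH", ptr)
    return segment * 16 + offset

DS = 0x4200  # hypothetical data segment, for illustration only
print(len(near_ptr(0x0010)), hex(deref_near(near_ptr(0x0010), DS)))       # 2 0x42010
print(len(far_ptr(0x5200, 0x0010)), hex(deref_far(far_ptr(0x5200, 0x0010))))  # 4 0x52010
```

Since a near pointer has only 16 bits of offset, whatever it points at has to live within 64K of the segment base, which is where the 64K limits in the list come from. Far pointers are also the ones whose segment halves need load-time fix-ups when they are initialized in the image.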
Then the compiler comes with multiple versions of the library, at
least one for each of the memory models. Often the memory model
setting in the IDE, or the command line switch that selects it, also
tells the compiler which version of the library to ask the linker
for.
Generally the compiler breaks code and data into segments by source
code files. That is, each source code file creates one code segment
for the function(s) it defines and one data segment for whatever data
it defines. Often there are extended keywords to override this and
customize the memory usage in more detail.
Of course on more modern processors (including the 386 and up) and
operating systems (including Windows, Linux, etc.), segments are gone
from the application program level. All programs are written to run
in a flat 32 bit (or, today, 64 bit) address space. The operating
system uses the memory management hardware of the processor to map
each program's address space to specific hardware memory addresses
without needing to fix up segment values.
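For contrast, here is a toy Python model of that mapping. It is a sketch under heavy assumptions: a single-level table held in a dict, 4K pages, and invented frame numbers, whereas real hardware walks multi-level tables in memory:

```python
PAGE_SIZE = 0x1000  # 4K pages, as on the 386

def translate(page_table, virtual_address):
    """Map a virtual address to a physical one via a per-process page table.

    page_table is a dict {virtual page number: physical frame number}.
    """
    page, offset = divmod(virtual_address, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset

# Two hypothetical processes, both built to run at virtual address 0x00400000.
process_a = {0x400: 0x1234}
process_b = {0x400: 0x5678}

print(hex(translate(process_a, 0x00400010)))  # 0x1234010
print(hex(translate(process_b, 0x00400010)))  # 0x5678010
```

Both programs use the same address in their code, and only the table differs, so nothing in the image has to be patched for the segment value it happens to load at.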
-p