Hi.
I have some trouble understanding the linking process and hence this related question.
Let's say I need a global variable that is needed by every single source file in my project.
I would create a constants.h header and put extern const int var; in it. In the corresponding source, constants.cpp, I would put const int var = 5;. (Oh, and a question here: wouldn't just int var = 5; work?)

Then, every source file would get var. (supposedly)
Will source files, that don't #include "constants.h" get access to var?
If so, how would it be possible if there is no link between the 2 files?

Edited 4 Years Ago by theguitarist

int and const int are not the same thing -- int is a variable that can change at runtime from one value to another, while const int may not even be given storage because the compiler may just treat it as if it were a macro by inserting its value wherever it is used.

Each source file is compiled separately, so only the files that include the header will see the declaration of "var". Any number of files can include the header.

while const int may not even be given storage because the compiler may just treat it as if it were a macro by inserting its value wherever it is used.

No, it can't be treated as a macro (substituting for a literal constant), because the const-variable must have an address in memory, i.e., you need to be able to write const int* p = &var; and get a valid address out of that. Literal constants are prvalues, and thus, don't have an address in memory, while const-variables are non-modifiable lvalues, they're different.

In corresponding source, "constants.cpp", I would put const int var = 5;.

This is wrong. In the "constants.cpp", you need to write:

extern const int var = 5;

This is because, by default, a const-variable has internal linkage (i.e., not accessible from other translation units). This means that writing const int var = 5; is equivalent to writing static const int var = 5;, which is inconsistent with the prior declaration (in "constants.h") as an extern.

Then, every source file would get var. (supposedly)
Will source files, that don't #include "constants.h" get access to var?

Every source file that has an #include "constants.h" will have access to that one and unique instance of the const-variable var. In other words, if you print the address of var from each source file, you will get the same value.

If so, how would it be possible if there is no link between the 2 files?

There is a link, made by the linker. Declaring the const-variable as "extern" gives it external linkage, meaning that the symbol will be exported in the compiled object file (from "constants.cpp") and that all other source files making use of that const-variable will be linked with that exported symbol.
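
As a rough single-file sketch of the pattern described above: the three sections would normally live in constants.h, constants.cpp, and some other source file, and the helper functions twice_var and address_of_var are hypothetical names added only for illustration.

```cpp
// --- constants.h (sketch) ---
extern const int var;        // declaration only: no storage, external linkage

// --- constants.cpp (sketch) ---
extern const int var = 5;    // the single definition, exported to the linker

// --- any other .cpp that includes constants.h (sketch) ---
// Hypothetical helpers: both refer to the one and only var.
int twice_var() { return 2 * var; }
const int* address_of_var() { return &var; }
```

In a real project, every source file calling something like address_of_var would see the same address, because all of them are linked against the one exported symbol.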

while const int may not even be given storage because the compiler may just treat it as if it were a macro by inserting its value wherever it is used.

Yes, an object of type const int need not be stored anywhere.

But then, the same is also true of any object, even modifiable objects.

If we have int m = 23 ; m is an lvalue, and its address can be taken: int* p = &m ; p will not be equal to the nullptr and p will not compare equal to a pointer to any object other than m; *p will alias m. All these are requirements of a conforming C++ implementation. That m should be actually stored somewhere is not.

The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine.
Foot note: This provision is sometimes called the 'as-if' rule, because an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program....

Consequently:

Under the 'as-if' rule an implementation is allowed to store two objects at the same machine address or not store an object at all if the program cannot observe the difference.

int foo() // const int // generated code: { return -2 ; }
{
    const int a = 7 ; // this implementation does not store a
    const int b = 9 ;

    const int* pa = &a ;
    const int* pb = &b ;

    if( pa && pb && pa != pb ) return *pa == 999 ? *pa + *pb : *pa - *pb ;
    else return a*b ;

    /* g++ 4.8 with -std=c++11 -pedantic -Wall -Werror -O3 -fomit-frame-pointer -c -S

    __Z3foov:
        movl    $-2, %eax
        ret

    */
}

int bar() // (modifiable) int // generated code: { return 0 ; }
{
    int a = 7 ; // this implementation does not store a
    int b = 9 ;

    int* pa = &a ;
    int* pb = &b ;

    if( pa != nullptr ) *pa = b ;

    if( pa && pb && pa != pb ) return *pa == 999 ? *pa + *pb : *pa - *pb ;
    else return a*b ;

    /* g++ 4.8 with -std=c++11 -pedantic -Wall -Werror -O3 -fomit-frame-pointer -c -S

    __Z3barv:
        xorl    %eax, %eax
        ret

    */
}

int baz() // array of (modifiable) int // generated code: { return 7 ; }
{
    int array[5] = { 0, 1, 2, 3, 4 } ; // this implementation does not store the array
    int* begin = array ;
    int* end = array + 5 ;
    int* mid = array + 2 ;
    return end - begin + *mid ;

    /* g++ 4.8 with -std=c++11 -pedantic -Wall -Werror -O3 -fomit-frame-pointer -c -S

    __Z3bazv:
        movl    $7, %eax
        ret

    */
}

int function( int&& ) ;

int foobar() // literal constant
{
    return function( 7 ) ; // this implementation stores the literal constant 7

    /* g++ 4.8 with -std=c++11 -pedantic -Wall -Werror -O3 -fomit-frame-pointer -c -S

    __Z6foobarv:
        subl    $44, %esp
        leal    28(%esp), %eax
        movl    %eax, (%esp)
        movl    $7, 28(%esp)
        call    __Z8functionOi
        addl    $44, %esp
        ret

    */
}

@mike_2000_17,
Thanks for your reply.

This is wrong. In the "constants.cpp", you need to write:
extern const int var = 5;
This is because, by default, a const-variable has internal linkage (i.e., not accessible from other translation units). This means that writing const int var = 5; is equivalent to writing static const int var = 5;, which is inconsistent with the prior declaration

Please correct me where I'm wrong.
Every source file compiled is a translation unit.
When I create constants.cpp and include constants.h in it, they both together form a translation unit.
Let's say I didn't use extern. It becomes static by default.
So when I now include constants.h in my several other .cpp files, will copies of the same variable (stored at separate addresses) be made? Is this what internal linkage is?
Is this what you meant by

(i.e., not accessible from other translation units)

Given that a constant variable can't and won't be changed in any source file, I don't see why static const would ever come into use. Isn't it a waste of memory?
Why is it static by default for a const variable, by design, then?

And for other non-const variables, it is extern by default, right?

Why must the extern keyword be mentioned twice, once in the header and once in the source? Won't the source suffice? As long as the compiler can find out the existence of that variable, how will it matter to it, whether it is extern or not? After all, the linking happens later, and the linker will find out it is extern, from the source.

Sorry for so many questions.

Please correct me where I'm wrong.

OK

Every source file compiled is a translation unit.

That's correct. Everything that turns into one object file (.o or .obj) is a translation unit. The compiler only looks at one translation unit at a time, the linker then assembles them.

When I create constants.cpp and include constants.h in it, they both together form a translation unit.

Pretty much. The cpp file that you compile is the translation unit. As for the headers, you have to think of the include-statements as just an "insert file here" mechanism that grabs all the content of the header file and inserts it in place of the include-statement. So, the cpp file and all the headers it includes turn into one massive source file, and that is the translation unit.

Let's say I didn't use extern. It becomes static by default.

For const-variables, yes.

So when I now include constants.h in my several other .cpp files, will copies of the same variable (stored at separate addresses) be made? Is this what internal linkage is?

Yes. That's what internal linkage means. It means that all the parts of the code in that translation unit will refer to (be linked with) an internal copy of the variable which only exists in that translation unit, and which is not visible to the linker when linking the object files (compiled translation units).

Given that a constant variable can't and won't be changed in any source file,

It cannot be changed at run-time. But you could change it in the source code. If you use the extern mechanism for that const-variable, you could change its value as defined in the cpp file where it appears, and all you need to recompile is that cpp file, all the others simply need to be re-linked to that new object file. And that's one use of this extern mechanism.

I don't see why static const would ever come into use. Isn't it a waste of memory?

Yes it is a small waste of memory, but it has other advantages. For a static variable, the compiler has access to all the source code that might possibly be using that variable (since it cannot be used outside the translation unit), this allows the compiler to possibly do some optimizations, possibly as far as removing the variable from memory completely, as shown by vijayan121. And even if it is not optimized away, it can still be more efficient due to locality of reference (when the memory you refer to is more localized, you can reduce cache misses and other time-consuming memory fetching operations).

Why is it by default static for a const variable, by design ,then?

Partially for the reasons given above. Very often, const variables will never change in value, so the advantage of being able to change the value only in the cpp file where it is defined does not apply very often. Most of the time, const variables are integers or other primitive types, meaning they don't take up much memory. Also, to be able to use a const variable as the size of a static array (C-style array), its definition (value) must appear in the same translation unit. In other words, the typical uses of const variables are in line with them being static (internal linkage), and hence, that's the default.
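
To illustrate the array-size point, here is a minimal sketch. The names kSize, kOther, table and table_length are hypothetical and chosen only for this example.

```cpp
#include <cstddef>

const int kSize = 4;        // definition visible here: usable as an array bound
int table[kSize];           // OK, kSize is a constant expression

extern const int kOther;    // declaration only: value unknown in this translation unit
// int bad[kOther];         // would not compile: kOther is not a constant expression here

// Hypothetical helper reporting how many elements table has.
std::size_t table_length() { return sizeof(table) / sizeof(table[0]); }
```

Since kOther is only declared and never used, the sketch still links even though no definition of it exists anywhere.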

And for other non-const variables, it is extern by default, right?

Yes. Unless they appear in an anonymous namespace.
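
A small sketch of the defaults discussed so far, assuming C++11 or later (where names in an unnamed namespace have internal linkage). All variable names and the sum_all function are hypothetical.

```cpp
int counter = 1;                    // non-const: external linkage by default
const int limit = 2;                // const: internal linkage by default
extern const int shared_limit = 3;  // const given external linkage explicitly
static int file_local = 4;          // static: internal linkage
namespace { int hidden = 5; }       // unnamed namespace: internal linkage

// Uses every variable so that none of them is reported as unused.
int sum_all() { return counter + limit + shared_limit + file_local + hidden; }
```

Only counter and shared_limit would be visible to other translation units here; the other three exist privately in this one.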

Why must the extern keyword be mentioned twice, once in the header and once in the source?

The extern appearing in the header instructs any code that uses that header that this variable is to be found via the linker, otherwise, the compiler will look for it in the current translation unit, and won't find it. The extern appearing in the source file that defines the variable is there to match the declaration in the header (just like you need function prototypes to match between declaration and definition, it's the same with variables). When the compiler does not find the definition of the extern variable in the translation unit, it will mark the symbol for the linker to resolve it, but if the compiler does find the definition, then it will export that symbol for the linker to find it. Both extern keywords are necessary. The only reason why the static or extern keywords can sometimes be missing from one or the other (decl. or def.) is when the keyword in question matches the default attribute.

As long as the compiler can find out the existence of that variable, how will it matter to it, whether it is extern or not?

Because the compiler must generate compiled code (binaries) from the source code. If the variable has internal linkage, the compiler can deal with it as it is compiled. In other words, it can replace code that refers to that variable with binary code that accesses the actual variable and its value (or optimize away that code, if the variable is const). If the variable is external, then the compiler must replace the code that refers to the variable with markers telling the linker to substitute in the correct reference to the actual variable (found in another TU). So, yes, it matters to the compiler whether the variable is extern or not.

After all, the linking happens later, and the linker will find out it is extern, from the source.

The linker isn't that smart, and it doesn't go back and compile anything. All the linker really does is collect the symbols exported by the different object files and libraries into one big list. Then it goes through the list of symbols used by each object file (a list generated by the compiler, recording every symbol the code calls or refers to and where those uses appear), and it resolves each of them to an entry in its big list. It doesn't do any kind of code analysis or compilation; it is purely a job of connecting the dots. Remember, linking happens after compilation, meaning that it just operates on binary code: at that point all compilation work has been done and most of the code information from the source files has disappeared. The linker probably isn't even aware of the types of the functions or variables that it is linking together. This is often called the "stupid linker" model, as used in C and C++ (in contrast to some other compiled languages that use much smarter linker models, where the linker isn't completely oblivious to what it is doing and can actually do some additional compilation).
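
The connect-the-dots job can be sketched as a toy model. This is not how any real linker is implemented; the ObjectFile struct and resolve_symbols function are hypothetical and only mimic the two steps described above.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an object file: the symbols it defines (exports)
// and the symbols it only refers to.
struct ObjectFile {
    std::string name;
    std::vector<std::string> defined;
    std::vector<std::string> referenced;
};

// Returns linker-style diagnostics; an empty result means every reference resolved.
std::vector<std::string> resolve_symbols(const std::vector<ObjectFile>& objects) {
    std::vector<std::string> errors;
    std::map<std::string, std::string> table;  // symbol -> defining object file

    // Step 1: collect every exported symbol into one big table.
    for (const ObjectFile& obj : objects) {
        for (const std::string& sym : obj.defined) {
            if (!table.emplace(sym, obj.name).second)
                errors.push_back("multiple definition of " + sym);
        }
    }
    // Step 2: match every used symbol against the table.
    for (const ObjectFile& obj : objects) {
        for (const std::string& sym : obj.referenced) {
            if (table.find(sym) == table.end())
                errors.push_back("undefined reference to " + sym);
        }
    }
    return errors;
}
```

Note that the model only compares names; it never looks at types or source code, which is the point being made about the linker.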

Thank You mike, for that elaborate post.
I think I understood what you said.
Now here is one issue I am facing in a project.
There is one constants.h file and its constants.cpp.
I have put all my constants there.
There is one variable called BOT_TIME that varies with the difficulty of my game, and hence isn't const. There are many files that use it.
1) In constants.h I declare it extern int BOT_TIME.
In constants.cpp, I declare it extern int BOT_TIME.
BUILD => undefined references to the variable in all sources(Yes, I've included the header).

2) In constants.h I declare it int BOT_TIME.
In constants.cpp, I declare it int BOT_TIME.
Since non-consts are extern by default, I decided to leave out that keyword.
BUILD => Multiple definition of the variable (shows in each source file that has constants.h included)

3) What worked ?
In constants.h I declare it extern int BOT_TIME.
In constants.cpp, I declare it int BOT_TIME.

Where is the issue?

Oh, initializing the variable in constants.cpp makes it work for cases 1 and 3.
What is happening?

Okay.
It turns out that extern int x is a declaration and int x is a definition.
Case 1 is not right because no definition of x is made. Case 2 is wrong because there are multiple definitions when only one is allowed. Case 3 is right.
Initializing x in constants.cpp, as in extern int x = 5;, makes that a definition, and hence that works too.
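
For reference, a single-file sketch of the working arrangement (case 3). The initial value 30 and the helper functions bot_time and set_bot_time are hypothetical, added only to show other files reading and writing the one variable.

```cpp
// --- constants.h (sketch) ---
extern int BOT_TIME;     // declaration: tells each includer the variable exists elsewhere

// --- constants.cpp (sketch) ---
int BOT_TIME = 30;       // the one definition; 30 is a made-up starting value

// --- any other .cpp that includes constants.h (sketch) ---
int bot_time() { return BOT_TIME; }
void set_bot_time(int seconds) { BOT_TIME = seconds; }
```

Dropping extern from the header line would turn it into a second definition in every file that includes it, which is the multiple-definition error from case 2.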

Thank you all.

Glad you figured it out. For a more complete understanding, I suggest you read the relevant parts of the C++ Standard: Section 3.1 "Declarations and Definitions", and Section 7.1.1 "Storage Class Specifiers".
