gcc linker declaration extern

Can identical variables declared 'extern' in multiple translation units be distinct entities?


According to the C99 standard (6.2.2p2):

In the set of translation units and libraries that constitutes an entire program, each
declaration of a particular identifier with external linkage denotes the same object or
function.

I am not a native English speaker, but to me this means that if I have extern int var; in two translation units, the linker will resolve both declarations to the same object, wherever that object is defined.

If this is true, then why does the compiler/linker not produce a diagnostic about conflicting types when I purposely change the type in one of the declarations, even though it does produce one when the two extern declarations appear in the same translation unit?

It even allows me to define the variable, with the changed type, in the same translation unit:

/* TU1 */
extern int var;

/* TU2 */
extern float var;
float var = 2.5;

But if I also define var in TU1, the linker basically crashes, again without providing any diagnostic. The compilation flags are:

-Wall -Wextra -pedantic -Werror=shadow -std=c99

My question boils down to this: why does GCC behave like this (failing to diagnose in both cases), and what would be the ideal, standard-conforming treatment of code like this?


Solution

  • why if I purposely change the type of one of the declarations the compiler/linker does not produce a diagnostic that the types are conflicting,

    Type information is lost by the time of linking. The linker sees only the names of symbols and matches them by name.

    why

    It's 2022 nowadays. LTO and C++ modules were introduced for a reason, and we have many more sophisticated programming languages. C is a 50-year-old language. When the C programming language and its linkers were invented, there was not enough computing power or manpower to make the linker aware of symbol types. It was simply never implemented.

    Note that the standard you are reading does not require a diagnostic when this rule is broken. Per C99 6.2.7p2, all declarations that refer to the same object must have compatible type; otherwise you get "undefined behavior". Undefined behavior means there is no requirement on what will happen: you may get a diagnostic, or the program may spawn nasal demons. You can't "expect" a diagnostic. See the Wikipedia article on undefined behavior and the question "Undefined, unspecified and implementation-defined behavior".

    Also note that you are reading C99, which is 20 years old. Consider C11, or C23.
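To make the "anything may happen" point concrete, here is a single-file sketch of what the mismatched declarations typically amount to when the linker resolves both names to one address: TU1 reads the bytes of a float as if they were an int. The helper name read_float_storage_as_int is hypothetical, and the sketch assumes a 32-bit IEEE 754 float and a 32-bit int. It uses memcpy so that the demonstration itself stays well defined, whereas the real two-TU program has undefined behavior and need not behave this way at all.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the bit pattern that an int-typed read
   would observe in storage that actually holds a float.
   Assumption: float is a 32-bit IEEE 754 type. */
int32_t read_float_storage_as_int(float stored)
{
    int32_t seen;

    /* Copy the object representation byte for byte; no conversion
       from float to int takes place, just as none takes place when
       two translation units disagree about the type of a symbol. */
    memcpy(&seen, &stored, sizeof seen);
    return seen;
}
```

Under those assumptions, passing 2.5f (the value from TU2) yields 0x40200000, that is 1075838976, rather than anything resembling 2 or 2.5.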

    it does so if the two extern declarations are declared in the same translation unit?

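    Because the compiler sees the whole translation unit when compiling, it sees the conflicting declarations and is able to report them. Put another way, the conflict violates a constraint (C99 6.7p4 requires all declarations in the same scope that refer to the same object to specify compatible types), and 5.1.1.3p1 requires a diagnostic for constraint violations.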
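The usual shared-header convention relies on exactly this same-translation-unit check. The sketch below collapses three hypothetical files (var.h, var.c, main.c) into one listing to show the idea: because the defining file also includes the header, the compiler sees the declaration and the definition together and has to diagnose a type mismatch. The file names and the function read_var are illustrative assumptions, not part of the question.

```c
/* ---- var.h (hypothetical shared header) ---- */
extern int var;   /* the single declaration that every TU includes */

/* ---- var.c (would contain: #include "var.h") ---- */
int var = 42;     /* changing this to float would conflict with the
                     declaration above, and the compiler must say so */

/* ---- main.c (would also contain: #include "var.h") ---- */
int read_var(void)
{
    return var;   /* uses the type taken from the shared header */
}
```

If each file instead repeats its own hand-written extern declaration, no translation unit contains both versions, and the mismatch goes unnoticed.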

    what should be the ideal, standard-conforming treatment for a code like this?

    Completely ignore such a case. Assume it will never happen.

    It is the other way round. A standard-conforming compiler is required to compile standard-conforming programs. There are no requirements on the result of compiling any other program. If you feed a program with undefined behavior to a conforming compiler, anything is allowed to happen. Compilers may assume the programmer never writes a program with undefined behavior, so they can completely ignore the possibility. It is the programmer's fault for having written an invalid program, and it is the programmer who should fix it.