Incorrect output of `nm` on GCC LTO fat object files


If I have tmp.c:

char constantFOO[0x12];
char constantBAR[0x34];

Running gcc -c tmp.c -o tmp.o && nm tmp.o shows:

0000000000000034 C constantBAR
0000000000000012 C constantFOO

But if I compile with -flto -ffat-lto-objects, nm outputs zeros for the symbol values:

00000000 C constantBAR
00000000 C constantFOO

I can see the 34 and 12 values in a hexdump of both .o files.

My questions are

  1. Is the behavior of nm on the LTO fat file expected? Am I just giving it input it's not expecting and it's outputting garbage?

  2. What explains the original output (symbol value matching uninitialized array length)? This question didn't seem to help for the question of arrays, but maybe I misunderstood.


Solution

  • I compiled your tmp.c both with and without -flto -ffat-lto-objects, in -S mode (output assembly language), using GCC 8.3. In both cases, the same basic definitions of your constants are emitted:

        .comm   constantFOO,18,16
        .comm   constantBAR,52,32
    

    Most of the additional data emitted by LTO goes into ELF sections named .gnu.lto_.something. LTO mode adds an additional marker object:

       .comm   __gnu_lto_v1,1,1
    

    This symbol appears in the LTO-compiled object but not in the one without.

    On its face, this shouldn't affect the output of nm for these symbols at all, and the lower-level tool readelf -s produces matching output for them:

    $ readelf -s tmp-normal.o
    
    Symbol table '.symtab' contains 9 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
         1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS test.c
         2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
         3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2 
         4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
         5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
         6: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
         7: 0000000000000010    18 OBJECT  GLOBAL DEFAULT  COM constantFOO
         8: 0000000000000020    52 OBJECT  GLOBAL DEFAULT  COM constantBAR
    
    $ readelf -s tmp-lto.o
    
    Symbol table '.symtab' contains 17 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
         1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS test.c
         2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
         3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2 
         4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
         5: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
         6: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
         7: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
         8: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 
         9: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
        10: 0000000000000000     0 SECTION LOCAL  DEFAULT    9 
        11: 0000000000000000     0 SECTION LOCAL  DEFAULT   10 
        12: 0000000000000000     0 SECTION LOCAL  DEFAULT   12 
        13: 0000000000000000     0 SECTION LOCAL  DEFAULT   11 
        14: 0000000000000010    18 OBJECT  GLOBAL DEFAULT  COM constantFOO
        15: 0000000000000020    52 OBJECT  GLOBAL DEFAULT  COM constantBAR
        16: 0000000000000001     1 OBJECT  GLOBAL DEFAULT  COM __gnu_lto_v1
    

    Therefore I believe that the behavior of nm is a bug, which should be reported to the maintainers of GNU binutils (see https://sourceware.org/binutils/).

    As for the "original output" with symbol value matching array length: normally, a symbol's value as shown by nm is its offset within its section of the object file. Common symbols, however, are not in any section and do not have an offset, so nm prints the size of the symbol as its value. This is, IIRC, historical behavior going all the way back to whichever iteration of System V added support for FORTRAN-like common data. Notice how readelf -s prints 18 and 52 as the sizes of the objects, and prints the third argument to .comm (the desired alignment of each symbol: 16 and 32, shown in hex as 10 and 20) as their values.

    If you compile with -fno-common you will see different output:

    $ gcc -c -fno-common tmp.c -o tmp-nc.o
    $ nm tmp-nc.o 
    0000000000000020 B constantBAR
    0000000000000000 B constantFOO
    $ readelf -s tmp-nc.o | grep constant
         7: 0000000000000000    18 OBJECT  GLOBAL DEFAULT    3 constantFOO
         8: 0000000000000020    52 OBJECT  GLOBAL DEFAULT    3 constantBAR
    

    because now your arrays are in the .bss section and have a defined offset within that section.

    Note that char constantFOO[0x12]; defines a writable array of 0x12 chars. If you want it actually to be constant you need to say const char. (And then it will be put in the .rodata section of the object file and the output of nm and readelf will be different yet again.)