c linux gcc ld elf

Why does the .data.rel.ro section take so much space in my executable?

I want to build a shared library on Linux that contains a big initialized array, and use this array in several executables. I'd expect this to reduce the total size of the build output, especially if several programs use this data. Unfortunately, this does not seem to be the case when the data in the shared object is marked read-only.

Here is my 4 MiB "tab" symbol inside the object file:

$ nm --print-size bigfile.o
0000000000000000 0000000000400000 R tab

I use ld to create a shared object:

ld -o libbigfile.so -shared bigfile.o

This results in a 4 MiB executable when linked with gcc -o bigfile main.o libbigfile.so

The section responsible for this seems to be .data.rel.ro:

$ readelf --section-headers bigfile
[21] .data.rel.ro      PROGBITS         0000000000403dc0  00002dc0
0000000000400000       0000000000000000 WA       0     0     64

But as I can see with readelf -x .data.rel.ro bigfile, the .data.rel.ro section is full of 0x00. So if the content of the shared object's .rodata section is only copied at load time, why does it take up that space in the executable binary instead of being allocated at load time, as .bss is?

How to reproduce:

I have a very simple main:

#include <stdio.h>

extern char tab[];

int main() {
  puts(tab); 
  return 0;
}

I produce my shared object from a C or assembly file, but the assembly file is smaller (C sadly has no "times" prefix), so here is the assembly version:

global tab:data BYTESIZE

BYTESIZE equ (1 << 22)

section .rodata
align 64
tab:
    times (BYTESIZE - 2) db 'A'
    db 0xA
    db 0x0

And to build it:

nasm -f elf64 bigfile.asm
ld -o libbigfile.so -shared bigfile.o
gcc -c main.c
gcc -o bigfile main.o libbigfile.so
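
For reference, a C source for the same array does not have to spell out every byte. The sketch below assumes a GNU-compatible compiler (GCC or Clang), because the range designator `[first ... last]` is a GNU extension and not ISO C. The macro name TAB_SIZE is made up for this example; it mirrors BYTESIZE from the assembly file.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Assumed name; 4 MiB, the same value as BYTESIZE in the NASM version. */
#define TAB_SIZE ((size_t)1 << 22)

/* GNU C range designator: fills tab[0] .. tab[TAB_SIZE - 3] with 'A'.
   This is a compiler extension, not standard C. */
const char tab[TAB_SIZE] = {
    [0 ... TAB_SIZE - 3] = 'A',
    [TAB_SIZE - 2]       = '\n',
    /* tab[TAB_SIZE - 1] is implicitly zero and terminates the string. */
};

static_assert(sizeof(tab) == TAB_SIZE, "tab must be exactly 4 MiB");
```

Saved under a hypothetical name such as bigfile.c, this could be built with gcc -fPIC -shared -o libbigfile.so bigfile.c in place of the nasm and ld steps.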

Note: If I put the tab symbol in the .data section, the size problem disappears.

Solution

why does it take that space in the executable binary instead of being allocated at load time as .bss does?

It's a bug (or rather a deficiency) in GNU ld: it didn't have to store a copy of the data in the file.

The reason the linker has to allocate the symbol in the main executable at all is copy relocations.

But there is no reason for the linker to also store a copy of the symbol contents in the file, and indeed LLD does not suffer from the same deficiency.

GNU-ld doesn't suffer from this either when the data is writable (as you noted), but does when the data is read-only.

// tab.c
const char tab[0x400000] = {'a'};

// main.c
#include <stdio.h>
extern const char tab[];
int main() { puts(tab); return 0; }

gcc -fPIC -shared -o tab.so tab.c && gcc main.c ./tab.so

readelf -Ws a.out | grep ' tab$'
     4: 0000000000403de0 0x400000 OBJECT  GLOBAL DEFAULT   21 tab
    32: 0000000000403de0 0x400000 OBJECT  GLOBAL DEFAULT   21 tab 

readelf -WS a.out | grep '\[21\]'
  [21] .data.rel.ro      PROGBITS        0000000000403760 002760 12d687 00  WA  0   0 32

As you can see, GNU-ld puts a copy of symbol contents into .data.rel.ro.

Now let's try using a different linker:

gcc main.c ./tab.so -fuse-ld=lld

ls -l a.out
rwxr-xr-x 1 user  6248 Dec  3 20:48 a.out

readelf -Ws a.out | grep ' tab$'
     6: 00000000002028a0 0x12d687 OBJECT  GLOBAL DEFAULT   22 tab
    28: 00000000002028a0 0x12d687 OBJECT  GLOBAL DEFAULT   22 tab

readelf -WS a.out | grep '\[22\]'
  [22] .bss.rel.ro       NOBITS          00000000002028a0 0008a0 400000 00  WA  0   0 32

LLD does not make the unnecessary copy, resulting in a much smaller executable.
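
The difference between the two outputs comes down to the section type: a PROGBITS section stores its bytes in the file, while a NOBITS section (like .bss) only reserves address space. A minimal sketch of that rule, assuming the Linux <elf.h> header; the helper name section_file_bytes is made up for illustration:

```c
#include <assert.h>
#include <elf.h>      /* Elf64_Shdr, SHT_NOBITS, SHT_PROGBITS (Linux header) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes a section occupies in the file.
   NOBITS sections have a size in memory but no contents on disk. */
static uint64_t section_file_bytes(const Elf64_Shdr *sh)
{
    assert(sh != NULL);
    return sh->sh_type == SHT_NOBITS ? 0 : sh->sh_size;
}
```

Applied to the headers above, a 0x400000-byte PROGBITS .data.rel.ro costs 4 MiB on disk, while a 0x400000-byte NOBITS .bss.rel.ro costs nothing.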

Gold does the same as GNU-ld.

Note that even though LLD doesn't make a copy in the executable, a copy will still be made at runtime (at program startup), so this is not ideal.
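
Conceptually, that startup copy is just a memcpy from the library's definition into the space the linker reserved in the executable. The following is a toy model only, not code from any real dynamic loader; the function name, its parameters, and the choice to copy the smaller of the two sizes are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of processing one copy relocation:
   exe_slot is the space reserved in the executable (e.g. in .bss.rel.ro),
   lib_def  is the initialized data inside the shared library.
   Assumption: copy the smaller of the two sizes so neither side overruns. */
static size_t apply_copy_reloc(void *exe_slot, size_t exe_size,
                               const void *lib_def, size_t lib_size)
{
    assert(exe_slot != NULL && lib_def != NULL);
    size_t n = exe_size < lib_size ? exe_size : lib_size;
    memcpy(exe_slot, lib_def, n);
    return n;
}
```

For the 4 MiB tab, this means 4 MiB of private memory is written in every process that uses the array, even though the library's own copy is never modified.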

A possible solution is to build all your binaries with -fPIC, but that is not ideal either, because -fPIC code is slower and larger.
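
Another option, if you control the library's interface, is to stop exporting the array as a data symbol and export an accessor function instead. Copy relocations only arise for data symbols, and function calls go through the PLT, so the executable neither stores nor copies the array. This is a sketch of the library side; the names tab_storage, get_tab and TAB_SIZE are made up for this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Assumed size, matching the 0x400000-byte tab used above. */
#define TAB_SIZE ((size_t)0x400000)

/* 'static' keeps the array out of the dynamic symbol table, so no
   executable can reference it directly and require a copy relocation. */
static const char tab_storage[TAB_SIZE] = {'a'};

static_assert(sizeof(tab_storage) == TAB_SIZE, "unexpected table size");

/* Exported accessor (hypothetical name): returns a pointer into the
   library's own read-only data. */
const char *get_tab(void)
{
    return tab_storage;
}
```

In main.c, the extern array declaration would then be replaced by a declaration of const char *get_tab(void); and a call such as puts(get_tab());. The cost is one extra function call per lookup, which callers can avoid by caching the returned pointer.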

In your question you state that you want to reduce compilation output size since several programs use the same tab[] array.

If your goal is to reduce the total size occupied by N different programs and all these programs are shipped together, the best solution might be to avoid the shared library, and instead link all of these programs into a single binary, BusyBox style.