c arm ld elf objcopy

How can I make objcopy -Obinary append every text section?


I am trying to use objcopy to get a binary dump of an ELF file that has not yet gone through the link stage. It's actually an RP2040 object file cross-compiled by gcc version 6.3.1 (the latest version available for 32-bit ARM in the repositories for Ubuntu Bionic).

readelf -S shows the following:

$ readelf -S pico-sdk/src/rp2_common/pico_stdio/stdio.c.obj
There are 69 section headers, starting at offset 0x912c:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000000 00  AX  0   0  2
  [ 2] .data             PROGBITS        00000000 000034 000000 00  WA  0   0  1
  [ 3] .bss              NOBITS          00000000 000034 000000 00  WA  0   0  1
  [ 4] .text.stdio_out_c PROGBITS        00000000 000034 000010 00  AX  0   0  4
  [ 5] .text.stdio_out_c PROGBITS        00000000 000044 0000bc 00  AX  0   0  4
  [ 6] .rel.text.stdio_o REL             00000000 006550 000008 08   I 67   5  4
  [ 7] .text.stdio_buffe PROGBITS        00000000 000100 000064 00  AX  0   0  4
  [ 8] .rel.text.stdio_b REL             00000000 006558 000018 08   I 67   7  4
  [ 9] .text.stdout_seri PROGBITS        00000000 000164 00002c 00  AX  0   0  4
  [10] .rel.text.stdout_ REL             00000000 006570 000018 08   I 67   9  4
  [11] .text.stdout_seri PROGBITS        00000000 000190 000010 00  AX  0   0  4
  [12] .rel.text.stdout_ REL             00000000 006588 000010 08   I 67  11  4
  [13] .text.stdio_put_s PROGBITS        00000000 0001a0 0000f8 00  AX  0   0  4
  [14] .rel.text.stdio_p REL             00000000 006598 000048 08   I 67  13  4
  [15] .text.stdio_get_u PROGBITS        00000000 000298 000084 00  AX  0   0  4
  [16] .rel.text.stdio_g REL             00000000 0065e0 000018 08   I 67  15  4
  [17] .text.stdio_putch PROGBITS        00000000 00031c 000094 00  AX  0   0  4
  [18] .rel.text.stdio_p REL             00000000 0065f8 000038 08   I 67  17  4
  [19] .text.stdio_puts_ PROGBITS        00000000 0003b0 000034 00  AX  0   0  4
  [20] .rel.text.stdio_p REL             00000000 006630 000018 08   I 67  19  4
  [21] .text.stdio_set_d PROGBITS        00000000 0003e4 000030 00  AX  0   0  4
  [22] .rel.text.stdio_s REL             00000000 006648 000008 08   I 67  21  4
  [23] .text.stdio_flush PROGBITS        00000000 000414 000020 00  AX  0   0  4
  [24] .rel.text.stdio_f REL             00000000 006650 000008 08   I 67  23  4
  [25] .text.stdio_init_ PROGBITS        00000000 000434 00000c 00  AX  0   0  4
  [26] .rel.text.stdio_i REL             00000000 006658 000008 08   I 67  25  4
  [27] .text.stdio_deini PROGBITS        00000000 000440 000024 00  AX  0   0  4
  [28] .rel.text.stdio_d REL             00000000 006660 000010 08   I 67  27  4
  [29] .text.stdio_getch PROGBITS        00000000 000464 000094 00  AX  0   0  4
  [30] .rel.text.stdio_g REL             00000000 006670 000020 08   I 67  29  4
  [31] .text.stdio_filte PROGBITS        00000000 0004f8 00000c 00  AX  0   0  4
  [32] .rel.text.stdio_f REL             00000000 006690 000008 08   I 67  31  4
  [33] .text.stdio_set_t PROGBITS        00000000 000504 000010 00  AX  0   0  4
  [34] .text.stdio_set_c PROGBITS        00000000 000514 000028 00  AX  0   0  4
  [35] .rel.text.stdio_s REL             00000000 006698 000008 08   I 67  34  4
  [36] .text.__wrap_getc PROGBITS        00000000 00053c 000074 00  AX  0   0  4
  [37] .rel.text.__wrap_ REL             00000000 0066a0 000020 08   I 67  36  4
  [38] .text.__wrap_putc PROGBITS        00000000 0005b0 000094 00  AX  0   0  4
  [39] .rel.text.__wrap_ REL             00000000 0066c0 000038 08   I 67  38  4
  [40] .text.__wrap_puts PROGBITS        00000000 000644 000034 00  AX  0   0  4
  [41] .rel.text.__wrap_ REL             00000000 0066f8 000018 08   I 67  40  4
  [42] .text.__wrap_vpri PROGBITS        00000000 000678 0000cc 00  AX  0   0  4
  [43] .rel.text.__wrap_ REL             00000000 006710 000048 08   I 67  42  4
  [44] .text.__wrap_prin PROGBITS        00000000 000744 000018 00  AX  0   0  4
  [45] .rel.text.__wrap_ REL             00000000 006758 000008 08   I 67  44  4
  [46] .bss.drivers      NOBITS          00000000 00075c 000004 00  WA  0   0  4
  [47] .bss.filter       NOBITS          00000000 00075c 000004 00  WA  0   0  4
  [48] .mutex_array      PROGBITS        00000000 00075c 000008 00  WA  0   0  4
  [49] .rodata.crlf_str. PROGBITS        00000000 000764 000002 00   A  0   0  4
  [50] .debug_info       PROGBITS        00000000 000766 0020b7 00      0   0  1
  [51] .rel.debug_info   REL             00000000 006760 001380 08   I 67  50  4
  [52] .debug_abbrev     PROGBITS        00000000 00281d 00057f 00      0   0  1
  [53] .debug_loc        PROGBITS        00000000 002d9c 000ba5 00      0   0  1
  [54] .rel.debug_loc    REL             00000000 007ae0 000bf8 08   I 67  53  4
  [55] .debug_aranges    PROGBITS        00000000 003941 0000c8 00      0   0  1
  [56] .rel.debug_arange REL             00000000 0086d8 0000b8 08   I 67  55  4
  [57] .debug_ranges     PROGBITS        00000000 003a09 0002a0 00      0   0  1
  [58] .rel.debug_ranges REL             00000000 008790 000410 08   I 67  57  4
  [59] .debug_line       PROGBITS        00000000 003ca9 0009a6 00      0   0  1
  [60] .rel.debug_line   REL             00000000 008ba0 0000b0 08   I 67  59  4
  [61] .debug_str        PROGBITS        00000000 00464f 00115c 01  MS  0   0  1
  [62] .comment          PROGBITS        00000000 0057ab 000032 01  MS  0   0  1
  [63] .debug_frame      PROGBITS        00000000 0057e0 0002a4 00      0   0  4
  [64] .rel.debug_frame  REL             00000000 008c50 000160 08   I 67  63  4
  [65] .ARM.attributes   ARM_ATTRIBUTES  00000000 005a84 000032 00      0   0  1
  [66] .shstrtab         STRTAB          00000000 008db0 00037a 00      0   0  1
  [67] .symtab           SYMTAB          00000000 005ab8 0007f0 10     68  93  4
  [68] .strtab           STRTAB          00000000 0062a8 0002a7 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  y (purecode), p (processor specific)

What I expect is that

arm-none-eabi-objcopy -Obinary pico-sdk/src/rp2_common/pico_stdio/stdio.c.obj bin.out

will include every section that is PROGBITS and has an 'A' flag. This corresponds to 26 sections with a total size of 1842 bytes. (This is from my own ELF parser which I think makes it easier to read.)

Text Sections:
Offset      Length      Name
---------------------------------------------
0x00000034  0x00000000  ".text"
0x00000034  0x00000000  ".data"
0x00000034  0x00000010  ".text.stdio_out_chars_no_crlf"
0x00000044  0x000000bc  ".text.stdio_out_chars_crlf"
0x00000100  0x00000064  ".text.stdio_buffered_printer"
0x00000164  0x0000002c  ".text.stdout_serialize_begin"
0x00000190  0x00000010  ".text.stdout_serialize_end"
0x000001a0  0x000000f8  ".text.stdio_put_string"
0x00000298  0x00000084  ".text.stdio_get_until"
0x0000031c  0x00000094  ".text.stdio_putchar_raw"
0x000003b0  0x00000034  ".text.stdio_puts_raw"
0x000003e4  0x00000030  ".text.stdio_set_driver_enabled"
0x00000414  0x00000020  ".text.stdio_flush"
0x00000434  0x0000000c  ".text.stdio_init_all"
0x00000440  0x00000024  ".text.stdio_deinit_all"
0x00000464  0x00000094  ".text.stdio_getchar_timeout_us"
0x000004f8  0x0000000c  ".text.stdio_filter_driver"
0x00000504  0x00000010  ".text.stdio_set_translate_crlf"
0x00000514  0x00000028  ".text.stdio_set_chars_available_callback"
0x0000053c  0x00000074  ".text.__wrap_getchar"
0x000005b0  0x00000094  ".text.__wrap_putchar"
0x00000644  0x00000034  ".text.__wrap_puts"
0x00000678  0x000000cc  ".text.__wrap_vprintf"
0x00000744  0x00000018  ".text.__wrap_printf"
0x0000075c  0x00000008  ".mutex_array"
0x00000764  0x00000002  ".rodata.crlf_str.6304"

Data Sections:
Offset      Length      Name
---------------------------------------------
0x00000034  0x00000000  ".bss"
0x0000075c  0x00000004  ".bss.drivers"
0x0000075c  0x00000004  ".bss.filter"

Total code size = 1842 in 26 sections.
Total data size = 8 in 3 sections.

However, what I get from objcopy is actually a bin.out file that is only 248 bytes long.

$ ls -l bin.out
-rw-rw-r-- 1 devel devel 248 Oct  6 03:53 bin.out

After racking my brain for several days, I realized that what objcopy is generating is only the longest TEXT section (".text.stdio_put_string") out of the 26. It isn't actually appending each text section to the one before it.

I can't find an option in objcopy that does what I want. I've tried options like --gap-fill=0. Everything just results in the same 248 byte file. Does anyone know if there is a way to resolve this issue? I would really like to come up with a way to generate binary files from this that use standard tools.

Thank you for any advice.

(In case anyone is curious, this is for tracking digital signatures of precursor files that get compiled into a binary.)


Solution

  • After racking my brain for several days, I realized that what objcopy is generating is only the longest TEXT section (".text.stdio_put_string") out of the 26. It isn't actually appending each text section to the one before it.

    You're close, but not quite right.

    By default objcopy copies sections that have non-zero size, have the ALLOC flag, and have a type other than NOBITS. Call such a section an image-section. Note that an image-section is not necessarily a PROGBITS section; e.g. a NOTE section may be an image-section. Your objcopy -Obinary command outputs N bytes, where N is the size of the largest image-section (which in your case happens to be the largest .text* section), but these N bytes are not that section. They are the garbage that results from outputting each eligible section, in ELF section order, each on top of the last, i.e. all of them aligned to the start of the raw binary.

    A demo of that.

    If you don't need convincing, you can skip to "The objcopy solution".

    From man objcopy, we read:

    -j sectionpattern --only-section=sectionpattern

    Copy only the indicated sections from the input file to the output file. This option may be given more than once. Note that using this option inappropriately may make the output file unusable. Wildcard characters are accepted in sectionpattern.

    If the first character of sectionpattern is the exclamation point (!) then matching sections will not be copied, even if earlier use of --only-section on the same command line would otherwise copy it. For example:

    --only-section=.text.* --only-section=!.text.foo

    will copy all sections matching '.text.*' but not the section '.text.foo'.

    This would suggest that, e.g.

    $ objcopy -Obinary -j .text* -j .data* -j .eh_frame file.o out.bin
    

    would concatenate all the sections with names matching .text*, .data* or .eh_frame from file.o into out.bin.

    Let's see:

    $ cat file.c
    int rr[3] = {2,5,7};
    int vv[4] = {11,13,17,19};
    
    int aa(int a, int b) {
        return a + b;
    }
    
    void bb(int * a, int * b, unsigned sz) {
        for (unsigned i = 0; i < sz; ++i) {
            b[i] = a[i];
        }
    }
    
    unsigned cc(unsigned a) {
        unsigned b = 0;
        for ( ; a; b+=a, --a){};
        return b;
    }
    

    Compile file.o with fine-grained sections:

    $ gcc -c -ffunction-sections -fdata-sections file.c
    $ readelf -WS file.o
    There are 17 section headers, starting at offset 0x3a0:
    
    Section Headers:
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
      [ 1] .text             PROGBITS        0000000000000000 000040 000000 00  AX  0   0  1
      [ 2] .data             PROGBITS        0000000000000000 000040 000000 00  WA  0   0  1
      [ 3] .bss              NOBITS          0000000000000000 000040 000000 00  WA  0   0  1
      [ 4] .data.rr          PROGBITS        0000000000000000 000040 00000c 00  WA  0   0  8
      [ 5] .data.vv          PROGBITS        0000000000000000 000050 000010 00  WA  0   0 16
      [ 6] .text.aa          PROGBITS        0000000000000000 000060 000018 00  AX  0   0  1
      [ 7] .text.bb          PROGBITS        0000000000000000 000078 000054 00  AX  0   0  1
      [ 8] .text.cc          PROGBITS        0000000000000000 0000cc 000029 00  AX  0   0  1
      [ 9] .comment          PROGBITS        0000000000000000 0000f5 000027 01  MS  0   0  1
      [10] .note.GNU-stack   PROGBITS        0000000000000000 00011c 000000 00      0   0  1
      [11] .note.gnu.property NOTE            0000000000000000 000120 000020 00   A  0   0  8
      [12] .eh_frame         PROGBITS        0000000000000000 000140 000078 00   A  0   0  8
      [13] .rela.eh_frame    RELA            0000000000000000 0002c0 000048 18   I 14  12  8
      [14] .symtab           SYMTAB          0000000000000000 0001b8 0000f0 18     15   5  8
      [15] .strtab           STRTAB          0000000000000000 0002a8 000017 00      0   0  1
      [16] .shstrtab         STRTAB          0000000000000000 000308 000094 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
      L (link order), O (extra OS processing required), G (group), T (TLS),
      C (compressed), x (unknown), o (OS specific), E (exclude),
      D (mbind), l (large), p (processor specific)
    

    The aggregate size of the image-sections (.text* + .data* + .eh_frame + .note.gnu.property) is:

    0xc + 0x10 + 0x18 + 0x54 + 0x29 + 0x20 + 0x78 = 0x149 = 329 bytes. 
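
    As a quick check of that arithmetic, the shell can sum the hex sizes directly. This is only a convenience sketch; the sizes are the ones read off the readelf listing above.

```shell
# Hex sizes of .data.rr, .data.vv, .text.aa, .text.bb, .text.cc,
# .note.gnu.property and .eh_frame, copied from the readelf listing.
total=$(( 0xc + 0x10 + 0x18 + 0x54 + 0x29 + 0x20 + 0x78 ))
printf '0x%x = %d bytes\n' "$total" "$total"
```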
    

    However:

    $ objcopy -Obinary -j .text* -j .data* -j .eh_frame -j .note.gnu.property file.o image.bin
    $ stat -c "%s" image.bin
    120
    

    Only 120 = 0x78 bytes have been output - the size of the largest matching section, .eh_frame. And the output of:

    $ objcopy -Obinary -j .eh_frame file.o eh_frame.bin
    

    is identical to image.bin:

    $ cmp image.bin eh_frame.bin; echo Done
    Done
    

    But in this case the largest image-section is also the last image-section in the file. So, prima facie, it might make up the whole of image.bin either because the largest section was chosen, or because the last section overlaid at the start of the output happened to be the largest one.

    We can decide between these cases by eliminating .eh_frame from the output, so that the largest image-section is not the last. In fact I'll eliminate all but the .text* and .data* sections, which will shorten matters a bit and have the same effect.

    $ objcopy -Obinary -j .text* -j .data* file.o text+data.bin
    

    In text+data.bin, the largest section will be .text.bb, size 0x54 = 84 bytes. The aggregate size of the .text* + .data* sections is:

    0xc + 0x10 + 0x18 + 0x54 + 0x29 = 0xB1 = 177 bytes
     
    

    The size of the text+data.bin, as we now expect, isn't that; it is:

    $ stat -c "%s" text+data.bin
    84
    

    Let's have .text.bb in a file by itself:

    $ objcopy -Obinary -j .text.bb file.o bb.bin
    

    It's the same size as text+data.bin:

    $ stat -c "%s" bb.bin
    84
    

    But it is not identical:

    $ cmp text+data.bin bb.bin
    text+data.bin bb.bin differ: byte 9, line 1
    

    That refutes the theory that the largest section is chosen.

    .text.bb is not the last section selected for text+data.bin: that is .text.cc, size 0x29 = 41 bytes.

    So let's have .text.cc in a file by itself:

    $ objcopy -Obinary -j .text.cc file.o cc.bin
    $ stat -c "%s" cc.bin
    41
    

    And see that:

    $ cmp -n41 text+data.bin cc.bin; echo Done
    Done
    

    The first 41 bytes of text+data.bin are the section .text.cc. .text.cc is the 2nd largest selected section. So let's see if the remaining 84-41 = 43 bytes of text+data.bin are the last 43 bytes of .text.bb:

    $ dd if=text+data.bin of=tail-text+data.bin bs=41 skip=1
    1+1 records in
    1+1 records out
    43 bytes copied, 8.6186e-05 s, 499 kB/s
    
    $ dd if=bb.bin of=tail-bb.bin bs=41 skip=1
    1+1 records in
    1+1 records out
    43 bytes copied, 0.000129071 s, 333 kB/s
    
    $ cmp tail-text+data.bin tail-bb.bin; echo done
    done
    

    And so they are. For objcopy -Obinary, the selected sections are overlaid, in ELF order, at the start of the output file.
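
    The overlay effect can be reproduced without objcopy at all. The sketch below is a model of the observed behaviour, not a description of objcopy's internals: it writes three made-up "sections" (plain text, hypothetical names s1, s2, s3) to the same output file at offset 0, in order, without truncating.

```shell
# Three made-up "sections" (not real ELF data), in "ELF order".
tmp=$(mktemp -d)
printf 'AAAA'       > "$tmp/s1"   # 4 bytes
printf 'BBBBBBBBBB' > "$tmp/s2"   # 10 bytes, the largest
printf 'CCCCCC'     > "$tmp/s3"   # 6 bytes, the last in order

: > "$tmp/out.bin"
for s in s1 s2 s3; do
    # Every section has address 0, so each one lands at offset 0;
    # conv=notrunc keeps whatever bytes are already beyond its end.
    dd if="$tmp/$s" of="$tmp/out.bin" conv=notrunc 2>/dev/null
done

overlay=$(cat "$tmp/out.bin")
echo "$overlay"
rm -rf "$tmp"
```

    The result has the length of the largest section, begins with the whole of the last section, and ends with the tail of the largest one, which is the same pattern found in text+data.bin.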

    The objcopy solution

    In the light of that finding, we can read between the lines of the paragraph that man objcopy devotes to explaining -O binary:

    objcopy can be used to generate a raw binary file by using an output target of binary (e.g., use -O binary). When objcopy generates a raw binary file, it will essentially produce a memory dump of the contents of the input object file. All symbols and relocation information will be discarded. The memory dump will start at the load address of the lowest section copied into the output file.

    [my emphasis, on the final sentence]

    The emphasised clause implies that the section load addresses found in the object file will be used as the file offsets of the respective sections in the output file. In file.o, an unlinked file, those addresses are of course all 0. So all the sections are output at start-of-file.

    A remedy for this would be to change the output section addresses so as to lay out the sections consecutively. objcopy has an option for that purpose:

    --change-section-address sectionpattern{=,+,-}val --adjust-section-vma sectionpattern{=,+,-}val

    Set or change both the VMA address and the LMA address of any section matching sectionpattern. If = is used, the section address is set to val. Otherwise, val is added to or subtracted from the section address. See the comments under --change-addresses, above. If sectionpattern does not match any sections in the input file, a warning will be issued, unless --no-change-warnings is used.

    So, to the first selected section we can assign address 0x0, then to each subsequent selected section assign an address equal to that of the previous section + its size. That would be like:

    $ objcopy -O binary -j .data* -j .text* --change-section-address .data.rr=0x0 \
    --change-section-address .data.vv=0xc --change-section-address .text.aa=0x1C \
    --change-section-address .text.bb=0x34 --change-section-address .text.cc=0x88 \
    file.o text+data-redux.bin
        
    

    With which:

    $ stat -c "%s" text+data-redux.bin
    177
    

    is exactly the noted size of the .text* + .data* sections.
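
    The address arithmetic behind that command is just a running sum of the section sizes. A minimal sketch, assuming the name:size pairs are typed in by hand from the readelf listing (the variable names are illustrative only):

```shell
# name:hex-size pairs in ELF order, copied from the readelf listing.
sections=".data.rr:c .data.vv:10 .text.aa:18 .text.bb:54 .text.cc:29"

addr=0
flags=""
for entry in $sections; do
    name=${entry%%:*}
    size=${entry##*:}
    # Each section starts where the previous one ended.
    flags="$flags --change-section-address $name=$(printf '0x%x' "$addr")"
    addr=$(( addr + 0x$size ))
done
echo "objcopy -O binary -j '.data*' -j '.text*'$flags file.o text+data-redux.bin"
echo "expected output size: $addr bytes"
```

    The final value of addr is the size the output file should have, so it doubles as a sanity check against stat.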

    We can prove that the selected sections are consecutive and correct.

    First we need the rest of the single-section binaries:

    $ objcopy -Obinary -j .data.rr file.o rr.bin
    $ objcopy -Obinary -j .data.vv file.o vv.bin
    $ objcopy -Obinary -j .text.aa file.o aa.bin
    

    .data.rr = 1st section copied, length 12:

    $ cmp text+data-redux.bin rr.bin
    cmp: EOF on rr.bin after byte 12, in line 1 
    

    Matched first 12 bytes. Discard them:

    $  dd if=text+data-redux.bin of=tail-rr.bin bs=12 skip=1
    13+1 records in
    13+1 records out
    165 bytes copied, 0.000108776 s, 1.5 MB/s
    

    .data.vv = 2nd section copied, length 16:

    $ cmp tail-rr.bin vv.bin
    cmp: EOF on vv.bin after byte 16, in line 1
    

    Matched next 16 bytes. Discard them:

    $ dd if=tail-rr.bin of=tail-vv.bin bs=16 skip=1
    9+1 records in
    9+1 records out
    149 bytes copied, 0.000169584 s, 879 kB/s
    

    .text.aa = 3rd section copied, length 24:

    $ cmp tail-vv.bin aa.bin
    cmp: EOF on aa.bin after byte 24, in line 1
    

    Matched next 24 bytes. Discard them:

    $ dd if=tail-vv.bin of=tail-aa.bin bs=24 skip=1
    5+1 records in
    5+1 records out
    125 bytes copied, 8.5904e-05 s, 1.5 MB/s
    

    .text.bb = 4th section copied, length 84:

    $ cmp tail-aa.bin bb.bin
    cmp: EOF on bb.bin after byte 84, in line 1
    

    Matched next 84 bytes. Discard them:

    $ dd if=tail-aa.bin of=tail-bb.bin bs=84 skip=1
    0+1 records in
    0+1 records out
    41 bytes copied, 0.00014661 s, 280 kB/s
    

    .text.cc = last section copied, length 41:

    $ cmp tail-bb.bin cc.bin; echo Done
    Done
    

    Matched last 41 bytes
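
    The same chunk-by-chunk check can be written as a loop. The sketch below uses made-up stand-in files rather than the real section binaries, so it runs anywhere; with the real files you would loop over rr.bin vv.bin aa.bin bb.bin cc.bin and compare against text+data-redux.bin.

```shell
tmp=$(mktemp -d)
# Stand-ins for the single-section binaries (made-up contents, not real code).
printf 'rrrrrrrrrrrr'     > "$tmp/rr.bin"   # 12 bytes
printf 'vvvvvvvvvvvvvvvv' > "$tmp/vv.bin"   # 16 bytes
printf 'aaaa'             > "$tmp/aa.bin"   # 4 bytes
cat "$tmp/rr.bin" "$tmp/vv.bin" "$tmp/aa.bin" > "$tmp/combined.bin"

offset=0
result=ok
for part in rr.bin vv.bin aa.bin; do
    size=$(( $(wc -c < "$tmp/$part") ))
    # Compare $size bytes of combined.bin, starting at $offset, with the part.
    if ! tail -c +"$(( offset + 1 ))" "$tmp/combined.bin" | head -c "$size" | cmp -s - "$tmp/$part"; then
        result="mismatch in $part at offset $offset"
        break
    fi
    offset=$(( offset + size ))
done
# Everything matched only if the parts also account for the whole file.
if [ "$result" = ok ] && [ "$offset" -ne "$(( $(wc -c < "$tmp/combined.bin") ))" ]; then
    result="trailing bytes after offset $offset"
fi
echo "$result"
rm -rf "$tmp"
```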

    Automation

    Unaided by automation, this solution grows rapidly unwieldy with the number of sections to be copied, which with -ffunction-sections/-fdata-sections compilations can be arbitrarily large.

    You say that you want a solution using "standard tools". If that means at most a pipe of stock commands I think you're out of luck, but at this point I expect you'd consider a bash script. Here's one that harnesses objcopy -O binary --change-section-address ... to write the image-sections of an input ELF file to an output file consecutively, without gaps [1].

    $ cat cat_elf_image_sections.sh 
    #!/bin/bash
    # cat_elf_image_sections.sh
    # Concatenate the sections of an ELF file that have non-zero size,
    # type != `NOBITS` and flags including `A` into an output file.
    # $1 = input file
    # $2 = output file
    
    rm -f "$2"
    tot_sz=0
    idx=0
    section_addr=0x0
    # Build the command as an array so that file names containing spaces survive.
    objcopy_cmd=(objcopy -O binary)
    # readelf prints "[ 1]" as two fields but "[11]" as one, hence the two awk rules.
    stats=($(readelf -WS "$1" | awk '$3 != "NOBITS" && $8 ~ "A" { print $2 " " $6}; $4 != "NOBITS" && $9 ~ "A" { print $3 " "  $7 }'))
    for ((idx=0;idx< ${#stats[@]} ;idx+=2));
    do
        section=${stats[idx]}
        sz_str=${stats[idx + 1]}
        (( section_sz=16#$sz_str ))   # readelf reports sizes in hex
        printf "section %s : size %u bytes" "$section" "$section_sz"
        if [[ $section_sz -gt 0 ]]; then
            (( tot_sz+=section_sz ))
            hex_section_addr=$(printf "0x%x" "$section_addr")
            printf " : output offset %u\n" "$section_addr"
            # Place this section immediately after the previous one.
            objcopy_cmd+=(--change-section-address "$section=$hex_section_addr")
            (( section_addr+=section_sz ))
        else
            printf "\n"
        fi
    done
    echo "Total size " $tot_sz " bytes"
    "${objcopy_cmd[@]}" "$1" "$2"
    

    Trying it:

    $ ./cat_elf_image_sections.sh file.o file1.bin
    section .text : size 0 bytes
    section .data : size 0 bytes
    section .data.rr : size 12 bytes : output offset 0
    section .data.vv : size 16 bytes : output offset 12
    section .text.aa : size 24 bytes : output offset 28
    section .text.bb : size 84 bytes : output offset 52
    section .text.cc : size 41 bytes : output offset 136
    section .note.gnu.property : size 32 bytes : output offset 177
    section .eh_frame : size 120 bytes : output offset 209
    Total size  329  bytes
    
    $ stat -c "%s" file1.bin
    329
    

    We calculated earlier that the aggregate size of the image-sections in file.o is 329 bytes.


    1. I assume you're aware of and don't care, for your present purpose, about the fact that the image-sections copied from an object file and packed consecutively into an output file will not necessarily be an exact fit for the corresponding sections they would be merged into by the linker in a program or shared library image, which may have interpolated padding to satisfy alignment requirements and sizes adjusted accordingly. For example, the aggregate size of the .data* sections in file.o is 28 bytes, but the linker on my system will merge them into a 32 byte output .data section with 4 bytes of alignment padding.
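
    To illustrate the padding mentioned in that note, here is a rough sketch of the alignment arithmetic for the two .data* sections of file.o. It models only per-section alignment; what a real linker does also depends on its linker script, so treat the helper align_up (a name made up for this example) as an assumption, not as the linker's algorithm.

```shell
# Round $1 up to the next multiple of $2.
align_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# size:alignment pairs (decimal) from the readelf listing:
# .data.rr is 12 bytes aligned to 8, .data.vv is 16 bytes aligned to 16.
packed=0
aligned=0
for entry in 12:8 16:16; do
    size=${entry%%:*}
    align=${entry##*:}
    packed=$(( packed + size ))
    aligned=$(align_up "$aligned" "$align")   # pad up to the section's alignment
    aligned=$(( aligned + size ))
done
echo "packed: $packed bytes, aligned: $aligned bytes"
```

    The difference between the two totals is the 4 bytes of padding inserted before .data.vv.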