c linux binary elf

Why are the bytes in ELF binaries inverted in groups of two?


I am trying to create an ELF format header editor. During development I noticed that, in the binary, groups of two bytes are always inverted.

Here is a hexdump, for example (I will call it hexdump1 for reference).

pc@pc-VirtualBox:~/Documents/ElfEditor$ hexdump Test | head
0000000 457f 464c 0102 0001 0000 0000 0000 0000
0000010 0003 003e 0001 0000 07a0 0000 0000 0000
0000020 0040 0000 0000 0000 2758 0000 0000 0000
0000030 0000 0000 0040 0038 0009 0040 0022 0021
0000040 0006 0000 0004 0000 0040 0000 0000 0000
0000050 0040 0000 0000 0000 0040 0000 0000 0000
0000060 01f8 0000 0000 0000 01f8 0000 0000 0000
0000070 0008 0000 0000 0000 0003 0000 0004 0000
0000080 0238 0000 0000 0000 0238 0000 0000 0000
0000090 0238 0000 0000 0000 001c 0000 0000 0000

For example, in the first 4 bytes I was expecting "7f45 4c46" and not "457f 464c".

When I run hexdump with the -C argument, I get the dump the way I expect it. (I will call this one hexdump2 for reference.)

00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  03 00 3e 00 01 00 00 00  a0 07 00 00 00 00 00 00  |..>.............|
00000020  40 00 00 00 00 00 00 00  58 27 00 00 00 00 00 00  |@.......X'......|
00000030  00 00 00 00 40 00 38 00  09 00 40 00 22 00 21 00  |....@.8...@.".!.|
00000040  06 00 00 00 04 00 00 00  40 00 00 00 00 00 00 00  |........@.......|
00000050  40 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00  |@.......@.......|
00000060  f8 01 00 00 00 00 00 00  f8 01 00 00 00 00 00 00  |................|
00000070  08 00 00 00 00 00 00 00  03 00 00 00 04 00 00 00  |................|
00000080  38 02 00 00 00 00 00 00  38 02 00 00 00 00 00 00  |8.......8.......|
00000090  38 02 00 00 00 00 00 00  1c 00 00 00 00 00 00 00  |8...............|

The binary seems to be saved as in hexdump1, which makes it hard to read in C.

Some additional information: it is not an endianness problem, because the related data, for example integers, are correct in little-endian format in hexdump2 and incorrect in hexdump1. For instance, a 32-bit integer (value 1) starts at byte offset 0x14; in hexdump2 it is correctly represented by the bytes 01 00 00 00, while in hexdump1 it is represented incorrectly (bytes 00 01 00 00).

So my questions are:

  1. Is this behavior normal?
  2. Why does it happen?
  3. Does it happen this way in all Linux distributions and architectures?
  4. Can I just invert the bytes in the entire file (including the rest of the binary)?
  5. Is there an easy or correct way to fix the byte order?

I would like my program to be flexible enough to run on any Linux distribution or architecture. Thanks in advance for the help.


Solution

  • The field

    e_ident[EI_DATA]
    

    defines the endianness of the ELF binary.

    Depending on the endianness, fields larger than 1 byte are stored with their bytes inverted or not.
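
As a rough illustration of the point about e_ident[EI_DATA], here is a minimal C sketch that decodes multi-byte header fields from raw bytes in file order (the order hexdump -C shows), choosing the byte order from that field. The helper names elf_read_u16 and elf_read_u32 and the local constant names are hypothetical, made up for this example; the numeric values (index 5 for EI_DATA, 1 for little-endian, 2 for big-endian) follow the ELF specification and are also provided by <elf.h> as EI_DATA, ELFDATA2LSB and ELFDATA2MSB. It assumes the header bytes have already been read into a buffer.

```c
#include <stdint.h>

/* Values from the ELF specification; <elf.h> provides the same ones as
 * EI_DATA, ELFDATA2LSB and ELFDATA2MSB. The names here are local to
 * this sketch. */
enum { IDX_EI_DATA = 5, DATA_LSB = 1, DATA_MSB = 2 };

/* Decode a 16-bit field stored at p, using the file's e_ident[EI_DATA].
 * Anything other than DATA_MSB is treated as little-endian here. */
uint16_t elf_read_u16(const unsigned char *p, unsigned char ei_data)
{
    if (ei_data == DATA_MSB)
        return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
    return (uint16_t)(p[0] | ((unsigned)p[1] << 8));
}

/* Decode a 32-bit field stored at p, using the file's e_ident[EI_DATA]. */
uint32_t elf_read_u32(const unsigned char *p, unsigned char ei_data)
{
    if (ei_data == DATA_MSB)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

With the bytes from hexdump2, buf[5] is 01 (little-endian), so elf_read_u32(buf + 0x14, buf[IDX_EI_DATA]) decodes the bytes 01 00 00 00 as the value 1. Because the helpers assemble the value byte by byte, the result does not depend on the endianness of the machine running the editor.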