c++ binary endianness hexdump xxd

Why do od and my C++ code read in a different endianness than what is rendered by hex editors?


I noticed an odd behavior where od -H and Vim's hex editor (open a file and use the command :%!xxd) display different endianness for the same data. I wrote some C++ code that dumps the first uint32_t from a file, and its endianness matches that of od instead of what is displayed in the hex editor:

dump.cc:

#include <cstdint>
#include <cstdio>
#include <cstring>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<uint8_t> ReadFile(const std::string &filename) {
  FILE *file = fopen(filename.c_str(), "rb");
  if (file == NULL) {
    throw std::runtime_error("Error opening file: " + filename);
  }

  fseek(file, 0L, SEEK_END);
  size_t file_size = ftell(file);
  rewind(file);

  std::vector<uint8_t> buffer(file_size);
  size_t bytes_read = fread(buffer.data(), 1, file_size, file);
  if (bytes_read != file_size) {
    fclose(file);
    throw std::runtime_error("Error reading file: " + filename);
  }
  fclose(file);
  return buffer;
}

int main(int argc, char **argv) {
  if (argc != 2) {
    std::cerr << "usage: dump FILE" << std::endl;
    return EXIT_FAILURE;
  }
  const char *filename = argv[1];
  const std::vector<uint8_t> buf = ReadFile(filename);

  uint32_t first_int;
  memcpy(&first_int, buf.data(), sizeof(uint32_t));
  std::cout << std::hex << first_int << std::endl;

  return EXIT_SUCCESS;
}

Compile and run:

$ g++ ./dump.cc -o dump
$ ./dump ./dump.cc
636e6923

In comparison, here are the first two lines of od -H:

$ od -H ./dump.cc | head -n 2
0000000          636e6923        6564756c        73633c20        6f696474
0000020          69230a3e        756c636e        3c206564        74736f69

On the other hand, here is what Vim displays:

00000000: 2369 6e63 6c75 6465 203c 6373 7464 696f  #include <cstdio
00000010: 3e0a 2369 6e63 6c75 6465 203c 696f 7374  >.#include <iost

I also opened the file in a hex editor app and it is rendering in the same endianness that Vim displays:

 0    23 69 6e 63 6c 75 64 65 20 3c 63 73 74 64 69 6f 3e 0a 23 69
20    6e 63 6c 75 64 65 20 3c 69 6f 73 74 72 65 61 6d 3e 0a 23 69

Why are od and my code displaying a different endianness? How do I get my code to read in the same endianness that these hex editors are displaying?

I am on macOS 14 on Apple Silicon; however, I am observing the same behavior on Ubuntu running on Windows 11 WSL on x86.

Thank you in advance.


Solution

  • vim and your hex editor work at the byte level, showing the bytes in the order they appear in the file.

    od, on the other hand, interprets the sequence of bytes. The -H option reads four bytes at a time and interprets them as a 32-bit (four-byte) int. You must know that there exist different mappings of an int's bytes into memory (it is just like writing something on paper, left-to-right or right-to-left), basically two: big-endian (most significant byte first) and little-endian (least significant byte first).

    The file starts with the bytes 23 69 6e 63, but since your platforms are little-endian (both x86 and Apple silicon are little-endian), the int is read as 0x63*256^3 + 0x6e*256^2 + 0x69*256^1 + 0x23*256^0 = 0x636e6923.

    You may make od dump byte by byte, matching the hex editors, with od -tx1.
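    To make your own code print the value in the same order the hex editors show, assemble the integer from individual bytes with explicit shifts instead of memcpy-ing raw memory; that makes the result independent of the host's byte order. Here is a minimal sketch (the helper names ReadBigEndian32 and ReadLittleEndian32 are my own, not standard functions):

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Interpret four bytes in file order (first byte is most significant).
    // This matches what vim/xxd display, regardless of host endianness.
    uint32_t ReadBigEndian32(const uint8_t *p) {
      return (uint32_t{p[0]} << 24) | (uint32_t{p[1]} << 16) |
             (uint32_t{p[2]} << 8) | uint32_t{p[3]};
    }

    // Interpret four bytes with the last byte as most significant.
    // This matches what od -H prints on a little-endian host.
    uint32_t ReadLittleEndian32(const uint8_t *p) {
      return (uint32_t{p[3]} << 24) | (uint32_t{p[2]} << 16) |
             (uint32_t{p[1]} << 8) | uint32_t{p[0]};
    }

    int main() {
      // First four bytes of dump.cc: "#inc"
      const uint8_t bytes[] = {0x23, 0x69, 0x6e, 0x63};
      assert(ReadBigEndian32(bytes) == 0x23696e63);     // hex-editor order
      assert(ReadLittleEndian32(bytes) == 0x636e6923);  // od -H order
      return 0;
    }
    ```

    In your program you would call ReadBigEndian32(buf.data()) instead of the memcpy, and the printed value would match vim's dump on any platform.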