flatbuffers

How are strings written in FlatBuffers?


I am researching the FlatBuffers file structure and I want to know how strings are written. From what I could gather, the string orc (for example) is written as the letter count in little-endian (0x3 0x0 0x0 0x0), followed by the actual letters, followed by something else. I am trying to understand what that something else is. What bytes follow the letters? I am only asking about the representation of this specific string in the buffer/file.


Solution

  • According to the FlatBuffers documentation:

    "Strings are simply a vector of bytes, and are always null-terminated. Vectors are stored as contiguous aligned scalar elements prefixed by a 32bit element count (not including any null termination). Neither is stored inline in their parent, but are referred to by offset. A vector may consist of more than one offset pointing to the same value if the user explicitly serializes the same offset twice."

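    Thus, the 4 bytes at the front are the 32-bit element count, and 0x3 0x0 0x0 0x0 means that there are 3 bytes in the string, excluding the null terminator. (FlatBuffers defaults to little-endian, per the same documentation.) The byte that follows the letters is therefore the null terminator, 0x0. For orc, the count, the three letters, and the terminator add up to 8 bytes; any further zero bytes after that would be alignment padding rather than part of the string.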
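
As a rough illustration of the layout the quoted documentation describes, the bytes for orc can be reproduced by hand with Python's standard `struct` module. This is only a sketch: `encode_string` is a hypothetical helper written for this answer, not part of the FlatBuffers API, and it ignores alignment padding and the offsets that refer to the string from elsewhere in the buffer.

```python
import struct


def encode_string(text: str) -> bytes:
    """Hypothetical helper: count prefix + bytes + null terminator."""
    data = text.encode("utf-8")
    # 32-bit little-endian element count; the terminator is not counted.
    prefix = struct.pack("<I", len(data))
    # The string bytes are always followed by a single zero byte.
    return prefix + data + b"\x00"


encoded = encode_string("orc")
print(" ".join(f"0x{b:02x}" for b in encoded))
# 0x03 0x00 0x00 0x00 0x6f 0x72 0x63 0x00
```

The last byte printed is the null terminator that follows the letters o, r, c (0x6f, 0x72, 0x63).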