tlv kaitai-struct

Making a basic TLV parser in Kaitai


I can't figure out how to write a simple TLV definition in Kaitai Struct.

I reduced the problem to a simple TLV file in which the same record (a one-byte type tag, a one-byte length, then that many value bytes) repeats until the end of the file.

This definition:

meta:
  id: basictlv
  endian: le
types:
  bunch_of_ones:
    seq:
      - id: how_many_1
        type: u1
      - id: ones
        size: how_many_1
  bunch_of_twos:
    seq:
      - id: how_many_2
        type: u1
      - id: twos
        size: how_many_2
seq:
  - id: type_tag
    type: u1
  - id: body
    type:
      switch-on: type_tag
      cases:
        1: bunch_of_ones
        2: bunch_of_twos

Used on this file

$ xxd simple-tlv-file.dat
00000000: 0102 0101 0203 0202 0203 0103            ............

Switches on the type correctly, but it only parses the first TLV and does not repeat.

$ ksdump simple-tlv-file.dat sample-tlv.yaml 2> /dev/null
body:
  how_many_1: 2
  ones: 01 01
type_tag: 1

Solution

  • To parse a basic TLV with Kaitai Struct, define each possible record as its own entry under types. Then, in your main sequence (seq), read the top-level "object" repeatedly until the end of the stream with repeat: eos (or use some other means of marking its end). This also lets you put a header or other data in the file before the actual TLV structure.

    So in your example, move the TLV block to the types section:

    types:
      block_of_data:
        seq:
        - id: type_tag
          type: u1
        - id: body
          type:
            switch-on: type_tag
            cases:
              1: bunch_of_ones
              2: bunch_of_twos
    

    You can also add another type, such as a header, to the types section:

      hello_header:
        seq:
          - id: magic
            contents:
              - 0x48
              - 0x45
              - 0x4C
              - 0x4C
              - 0x4F
    

    And you will be able to parse this file:

    $ xxd ~/downloads/simple-tlv-file.dat
    00000000: 4845 4c4c 4f02 0101 0203 0202 0201 0201  HELLO...........
    00000010: 01
    

    With a main seq like this one:

    seq:
      - id: header
        type: hello_header
      - id: data
        type: block_of_data
        repeat: eos
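
    Putting the pieces above together, the complete definition might look like the sketch below. It only reassembles the fragments from the question and this answer (same ids, same type names), so it handles tags 1 and 2 and nothing else.

    ```yaml
    meta:
      id: basictlv
      endian: le
    seq:
      # Fixed header first, then TLV records until the stream is exhausted
      - id: header
        type: hello_header
      - id: data
        type: block_of_data
        repeat: eos
    types:
      hello_header:
        seq:
          - id: magic
            contents:   # ASCII "HELLO"
              - 0x48
              - 0x45
              - 0x4C
              - 0x4C
              - 0x4F
      # One TLV record: a tag byte, then a body whose layout depends on the tag
      block_of_data:
        seq:
          - id: type_tag
            type: u1
          - id: body
            type:
              switch-on: type_tag
              cases:
                1: bunch_of_ones
                2: bunch_of_twos
      bunch_of_ones:
        seq:
          - id: how_many_1
            type: u1
          - id: ones
            size: how_many_1
      bunch_of_twos:
        seq:
          - id: how_many_2
            type: u1
          - id: twos
            size: how_many_2
    ```

    For a file without the header, like the one in the question, drop the header entry from seq and the hello_header type. The repeat: eos on data is what makes the parser keep reading records.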