python id3 kaitai-struct

Python: reading ID3v1 tag with Kaitai Struct


I'm trying to get Kaitai Struct to parse the ID3v1 tag format for MP3s. According to the standard, it is a fixed-format structure located at a certain offset, but the trick is that this offset is calculated not from the beginning of the file, but from the end.

Here's the basic .ksy outline of the tag. I take it for granted that it shouldn't really change:

meta:
  id: id3v1
types:
  id3v1_tag:
    seq:
      - id: magic
        contents: 'TAG'
      - id: title
        size: 30
      - id: artist
        size: 30
      - id: album
        size: 30
      - id: year
        size: 4
      - id: comment
        size: 30
      - id: genre
        type: u1

and here's my naïve idea on how to get it to be read starting 128 bytes before the end of the file:

instances:
  tag:
    pos: -128
    type: id3v1_tag

I try that with a simple Python test script:

#!/usr/bin/env python

from id3v1 import *

f = Id3v1.from_file('some_file_with_id3.mp3')
print(f.tag)

However, it seems to pass that negative amount directly into the seek() of Python's file object and thus fails:

Traceback (most recent call last):
  File "try-id3.py", line 6, in <module>
    print(f.id3v1_tag)
  File "id3v1_1.py", line 171, in id3v1_tag
    self._io.seek(-128)
  File "kaitaistruct.py", line 29, in seek
    self._io.seek(n)
IOError: [Errno 22] Invalid argument

After a few other equally insane ideas, I've found a workaround: I can omit the pos argument in the .ksy and then manually seek to the proper position in my script:

f = Id3v1.from_file('some_file_with_id3.mp3')
f._io.seek(-128, 2)
print(f.tag.title)
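The second argument in that seek call is the whence value: 2 is os.SEEK_END, so the offset is counted back from the end of the stream rather than from the start. A minimal sketch of the same idea on an ordinary binary stream, with no Kaitai Struct involved (read_tail and the in-memory stand-in for an MP3 file are made up for the example):

```python
import io
import os


def read_tail(stream, length=128):
    """Return the last `length` bytes of a seekable binary stream."""
    # os.SEEK_END (numeric value 2) makes the offset relative to the end.
    stream.seek(-length, os.SEEK_END)
    return stream.read(length)


# Stand-in for an MP3 file: 1000 bytes of "audio" followed by a 128-byte tag.
fake_mp3 = io.BytesIO(b"\xff" * 1000 + b"TAG" + b"\x00" * 125)
tail = read_tail(fake_mp3)
print(len(tail), tail[:3] == b"TAG")
```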

This works, but feels really hackish :( Is there a better way to do it in Kaitai Struct and Python?


Solution

  • There's a new feature in the upcoming v0.4 of Kaitai Struct that addresses exactly this issue. You can use _io to get the current stream object, and then use .size to get the full length of that stream in bytes. Thus, if you want to address some structure at a fixed offset from the end of the stream, you can use something like this in your .ksy:

    instances:
      tag:
        pos: _io.size - 128
        type: id3v1_tag
    

    Note that since the current stable release is v0.3, you'll have to download and build the compiler and runtimes from GitHub and use those latest versions.
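To make the arithmetic behind that pos expression concrete, here is a rough plain-Python equivalent: take the stream's total size, subtract 128, and seek to the resulting absolute offset. It only mirrors what the expression computes and is not the code the compiler emits; tag_offset and TAG_SIZE are names invented for this sketch.

```python
import io
import os

TAG_SIZE = 128  # fixed size of an ID3v1 tag


def tag_offset(stream):
    """Return the absolute offset of the last TAG_SIZE bytes of the stream."""
    stream.seek(0, os.SEEK_END)
    size = stream.tell()  # total length of the stream in bytes
    if size < TAG_SIZE:
        raise ValueError("stream is shorter than %d bytes" % TAG_SIZE)
    return size - TAG_SIZE


# Stand-in for an MP3 file: 1000 bytes of "audio" followed by a 128-byte tag.
fake_mp3 = io.BytesIO(b"\xff" * 1000 + b"TAG" + b"\x00" * 125)
offset = tag_offset(fake_mp3)
fake_mp3.seek(offset)  # a non-negative absolute position, so a plain seek works
magic = fake_mp3.read(3)
print(offset, magic == b"TAG")
```

Because the computed offset is never negative, it can be passed to a single-argument seek, which is what the failing pos: -128 attempt could not do.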