To give some context, I am parsing a DICOM file, and having difficulties using the Transfer Syntax entry to determine whether to use implicit or explicit parsing. But let me define a simplified syntax, so no dicom knowledge is required.
We have a sequence of entries, each entry has a group number
, and a data
part. The group number is always represented as u2
, but the data can be of different types, lets say u2
or u4
. The order of the entries can be arbitrary, except that all the entries with group number == 2
must be at the top. All entries with group number == 2
has a data type of u2
, but subsequent data parts can differ.
And here comes the hard part: all the items with group number != 2
have data type u4
if and only if an entry exactly like this was present previously:
(group, data) == (0x0002, 0x0101)
In python for example, I would parse it like this:
def read_entries(stream):
is_u4 = False
while not stream.eos():
group = stream.read_u2()
if group != 2 and is_u4:
data = stream.read_u4()
else:
data = stream.read_u2()
if group == 2 and data == 0x0101:
is_u4 = True
yield (group, data)
Is there a way to achieve this using kaitai-struct?
Exact transcription of Python code to KS per se is impossible right now, only by plugging in the code written in imperative language.
However, thanks to your additional info, an alternative approach is available, see below for solution.
Kaitai Struct emphasizes stateless parsing, thus everything we parse is effectively immutable (read-only), i.e. there are variables that can change its value during the parsing. Thus propagating is_u4
between parse cycles is non-trivial thing to do. We have similar problem with MIDI running status, for example.
Some of the solutions suggested by people sometimes is recursive type definition + propagation of instances using _parent
syntax (see issue #70), but:
Alternative approach, given additional information you've provided might be viable. Actually, it turns out that whole stream of elements can be grouped in 3 groups:
Thus it possible to get away with:
types:
elements:
seq:
- id: elements_2
type: element_2
repeat-until: _.is_u4
- id: elements_4
type: element_4
repeat: eos
element_2:
seq:
- id: group
type: u2
- id: data
type: u2
instances:
is_u4:
value: group == 2 and data == 0x0101
element_4:
seq:
- id: group
type: u2
- id: data
type: u4
Please tell me if I got it right and that will work for your project?