python regex parsing file-format parsimonious

python parsing: what file format uses `=>` OR how to read custom input files to dict


While using the zmdp solver from here, I came across an unusual file format that I haven't seen before: it uses => for assignment. I wasn't able to find out what format it is from the package documentation (it calls it a "policy" format, but it must be based on something more generic).

{
  policyType => "MaxPlanesLowerBound",
  numPlanes => 7,
  planes => [
    {
      action => 2,
      numEntries => 3,
      entries => [
        0, 18.7429,
        1, 18.7426,
        2, 21.743
      ]
    },
    ### more entries ###
    {
      action => 3,
      numEntries => 3,
      entries => [
        0, 20.8262,
        1, 20.8261,
        2, 20.8259
      ]
    }
  ]
}

I did a fair amount of research on straightforward ways to parse such files in Python, and also read this blog post, which covers a huge variety of options for lexing and parsing (the tools that looked most promising for my example were parsimonious and parsy).
However, every solution I can think of feels like re-inventing the wheel, and lexing and parsing seem like overkill for what I'm trying to do.
I also found this Stack Overflow question, which coincidentally also concerns a format that uses =>. However, being lazy and minimalistic when it comes to code, I don't like the regex solution much. My gut feeling is that there must be a 3-4 line solution for reading the input file into a Python dict or a similarly useful structure. In particular, I suspect that this is the standard syntax of some format I'm just not aware of (it's obviously not CSV, JSON, YAML or XML).

The question therefore is: Is the above a standard file format, and if yes, what is it?
If not, how do I parse this file elegantly and compactly in Python3, i.e. without regexing for every keyword?


Solution

  • I don’t see any differences from JSON here aside from replacing ‘=>’ with ‘:’ and adding a top-level key.

    import json
    filestr = filestr.replace('=>', ':')  # str.replace returns a new string, so assign it
    dictionary = json.loads(filestr)
    

    Edited after seeing comment above.

    Unquoted keys are indeed not part of the JSON standard, so json.loads will reject the file even after the replacement. To address that, you can use a library as described here, or you can quote the keys with a regex.
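
For what it's worth, the regex route can stay quite short. The following is only a minimal sketch, not an official zmdp parser: the function name `parse_policy` is made up for illustration, and it assumes that comments always occupy a whole line starting with `#`, that keys are plain identifiers, and that no string value contains text of the form `word =>`.

```python
import json
import re


def parse_policy(text):
    """Convert the '=>' policy text to a dict (hypothetical helper)."""
    # Assumption: comments sit on their own line and start with '#'.
    text = re.sub(r"^[ \t]*#.*$", "", text, flags=re.MULTILINE)
    # Quote each bare key and turn '=>' into ':' in a single pass.
    text = re.sub(r"(\w+)\s*=>", r'"\1":', text)
    return json.loads(text)


# Shortened version of the sample from the question.
sample = """
{
  policyType => "MaxPlanesLowerBound",
  numPlanes => 7,
  planes => [
    {
      action => 2,
      numEntries => 3,
      entries => [
        0, 18.7429,
        1, 18.7426,
        2, 21.743
      ]
    },
    ### more entries ###
    {
      action => 3,
      numEntries => 3,
      entries => [
        0, 20.8262,
        1, 20.8261,
        2, 20.8259
      ]
    }
  ]
}
"""

policy = parse_policy(sample)
print(policy["policyType"])            # MaxPlanesLowerBound
print(policy["planes"][0]["action"])   # 2
```

If the real files contain trailing commas or inline comments, json.loads will still raise an error, and a more tolerant parser would be the safer choice.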