python json jsoniq

JSON query that returns parent element and child data?


Given the following JSON:

{
    "README.rst": {
        "_status": {
            "md5": "952ee56fa6ce36c752117e79cc381df8"
        }
    },
    "docs/conf.py": {
        "_status": {
            "md5": "6e9c7d805a1d33f0719b14fe28554ab1"
        }
    }
}

is there a query language that can produce:

{
    "README.rst": "952ee56fa6ce36c752117e79cc381df8",
    "docs/conf.py": "6e9c7d805a1d33f0719b14fe28554ab1"
}

My best attempt so far with JMESPath (http://jmespath.org/) isn't very close:

>>> import jmespath
>>> jmespath.search('*.*.md5[]', db)
['952ee56fa6ce36c752117e79cc381df8', '6e9c7d805a1d33f0719b14fe28554ab1']

I've gotten to the same point with ObjectPath (http://objectpath.org):

>>> from objectpath import Tree
>>> t = Tree(db)
>>> list(t.execute('$..md5'))
['952ee56fa6ce36c752117e79cc381df8', '6e9c7d805a1d33f0719b14fe28554ab1']

I couldn't make any sense of JSONiq (do I really need to read a 105-page manual to do this?). This is my first time looking at JSON query languages.


Solution

  • I missed the Python requirement, but if you are willing to call an external program, this will still work. Note that jq >= 1.5 is required.

    # Merge everything into one object. If a top-level key ($p[0]) has several
    # nested "md5" entries, `add` keeps only the last one for that key.
    cat /tmp/test.json | \
    jq-1.5 '[paths(has("md5")?) as $p | { ($p[0]): getpath($p)["md5"] }] | add'
    
    # Without `add` the result is an array of single-key objects,
    # so every (key, md5) combination stays visible.
    cat /tmp/test.json | \
    jq-1.5 '[paths(has("md5")?) as $p | { ($p[0]): getpath($p)["md5"] }]'
    

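    How it works: paths(has("md5")?) yields the path of every value that is an object with an "md5" key; the trailing ? suppresses the errors that has() raises on values that cannot be tested for a key, such as scalars. Each path is bound to $p, and { ($p[0]): getpath($p)["md5"] } builds a one-key object that maps the top-level key to the md5 value found at that path. The surrounding [ ... ] collects those objects into an array, and | add merges the array into a single object.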

    https://stedolan.github.io/jq/
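
Since the question asked for Python, here is a rough sketch of the same idea without any external program or query library. It is not a query language, just a recursive walk that mimics what the jq filter does; the function name md5_by_top_key is made up for this example, and the sample data is copied from the question.

```python
import json


def md5_by_top_key(db):
    """Map each top-level key to an "md5" value found anywhere beneath it.

    Hypothetical helper: roughly mirrors the jq filter above, including the
    behaviour that a later match under the same top-level key overwrites
    an earlier one (like jq's `add` on objects).
    """
    result = {}

    def walk(node, top_key):
        if isinstance(node, dict):
            if "md5" in node:
                result[top_key] = node["md5"]
            for child in node.values():
                walk(child, top_key)
        elif isinstance(node, list):
            for child in node:
                walk(child, top_key)
        # Scalars have no keys to test, so they are skipped,
        # which is what the `?` does in the jq version.

    for key, value in db.items():
        walk(value, key)
    return result


db = json.loads("""
{
    "README.rst": {"_status": {"md5": "952ee56fa6ce36c752117e79cc381df8"}},
    "docs/conf.py": {"_status": {"md5": "6e9c7d805a1d33f0719b14fe28554ab1"}}
}
""")

print(json.dumps(md5_by_top_key(db), indent=4))
```

If the structure is always exactly key -> "_status" -> "md5", as in the question, a plain dict comprehension such as {k: v["_status"]["md5"] for k, v in db.items()} should be enough; the recursive version only matters when the md5 entries can sit at varying depths.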