I need to take a highly nested JSON file (e.g. an Elasticsearch mapping for an index) and produce a list of dotted key paths.
Example Elasticsearch Mapping:
{
  "mappings": {
    "properties": {
      "class": {
        "properties": {
          "name": {
            "properties": {
              "firstname": {
                "type": "text"
              },
              "lastname": {
                "type": "text"
              }
            }
          },
          "age": {
            "type": "text"
          }
        }
      }
    }
  }
}
Example Desired Result:
["mappings.properties.class.properties.name.properties.firstname",
"mappings.properties.class.properties.name.properties.lastname",
"mappings.properties.class.properties.age"]
pandas.json_normalize() doesn't quite do what I want, and neither does glom().
You should be able to write a fairly short recursive generator to do this. I'm assuming you want to collect keys until you reach a dict that contains a "type" key:
d = {
    "mappings": {
        "properties": {
            "class": {
                "properties": {
                    "name": {
                        "properties": {
                            "firstname": {"type": "text"},
                            "lastname": {"type": "text"}
                        }
                    },
                    "age": {"type": "text"}
                }
            }
        }
    }
}
def all_keys(d, path=None):
    if path is None:
        path = []
    # Stop descending once we hit a non-dict leaf or a dict with a "type" key,
    # and yield the dotted path accumulated so far.
    if not isinstance(d, dict) or 'type' in d:
        yield '.'.join(path)
        return
    for k, v in d.items():
        yield from all_keys(v, path + [k])

list(all_keys(d))
Which gives:
['mappings.properties.class.properties.name.properties.firstname',
'mappings.properties.class.properties.name.properties.lastname',
'mappings.properties.class.properties.age']
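Since the question mentions reading the mapping from a JSON file, here is a minimal self-contained sketch of feeding real JSON into the generator. json.loads on a string stands in for json.load(file_object), which you would use with an actual file on disk:

```python
import json

# Same generator as above, reproduced so this snippet runs on its own.
def all_keys(d, path=None):
    if path is None:
        path = []
    if not isinstance(d, dict) or 'type' in d:
        yield '.'.join(path)
        return
    for k, v in d.items():
        yield from all_keys(v, path + [k])

# Raw JSON as it might arrive from a mapping file or an Elasticsearch
# response body; for a file you would do: data = json.load(open(path)).
raw = '''{"mappings": {"properties": {"class": {"properties": {
    "name": {"properties": {"firstname": {"type": "text"},
                            "lastname": {"type": "text"}}},
    "age": {"type": "text"}}}}}}'''

paths = list(all_keys(json.loads(raw)))
print(paths)
```

Note that dicts preserve insertion order (guaranteed since Python 3.7), so the paths come out in the same order the fields appear in the JSON.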