Here is my JSON file. I want to load only the data lists from it, one at a time, and then, for example, plot each one.
This is a small example; my real data set is so large that loading the whole file raises a memory error.
{
  "earth": {
    "europe": [
      {"name": "Paris", "type": "city"},
      {"name": "Thames", "type": "river"},
      {"par": 2, "data": [1,7,4,7,5,7,7,6]},
      {"par": 2, "data": [1,0,4,1,5,1,1,1]},
      {"par": 2, "data": [1,0,0,0,5,0,0,0]}
    ],
    "america": [
      {"name": "Texas", "type": "state"}
    ]
  }
}
Here is what I tried:
import ijson
filename = "testfile.json"
f = open(filename)
mylist = ijson.items(f, 'earth.europe[2].data.item')
print mylist
It returns nothing; even when I convert the result to a list, I get an empty one:
[]
You need to specify a valid prefix. ijson prefixes are built from dictionary keys and the word item for list entries, joined by dots. You can't select a specific list item, so [2] doesn't work.
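Since a prefix cannot address a single index, one workaround is to stream the entries and pick one by position. The following is a minimal sketch, not part of ijson: the helper name nth is hypothetical, and it works on any iterator, so it only assumes that ijson.items() is iterated lazily.

```python
from itertools import islice


def nth(iterable, n, default=None):
    """Return the n-th element (0-based) of an iterable, or `default`.

    Elements before position n are consumed and discarded one at a time,
    so the whole sequence is never held in memory.
    """
    return next(islice(iterable, n, None), default)


# Hypothetical usage with the file from the question (requires ijson):
#
#     import ijson
#     with open("testfile.json", "rb") as f:
#         entry = nth(ijson.items(f, "earth.europe.item"), 2)
#         print(entry["data"])
```

Only the entries up to the requested position are parsed; the rest of the file is never read if you stop iterating afterwards.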
If you want the value of the data key from every dictionary in the europe list, the prefix is:
earth.europe.item.data
# ^ ------------------- outermost key must be 'earth'
#       ^ ------------- next key must be 'europe'
#              ^ ------ any value in the array
#                   ^   the value for the 'data' key
This produces each such list:
>>> l = ijson.items(f, 'earth.europe.item.data')
>>> for data in l:
...     print(data)
...
[1, 7, 4, 7, 5, 7, 7, 6]
[1, 0, 4, 1, 5, 1, 1, 1]
[1, 0, 0, 0, 5, 0, 0, 0]
You can't put wildcards in the prefix, so you can't use earth.*.item.data, for example.
If you need more complex prefix matching, you have to use the ijson.parse() function and handle the events it produces yourself. You can reuse the ijson.ObjectBuilder() class to turn the events you are interested in into Python objects:
parser = ijson.parse(f)
for prefix, event, value in parser:
    # Only the start of an array can begin a 'data' list.
    if event != 'start_array':
        continue
    if prefix.startswith('earth.') and prefix.endswith('.item.data'):
        # The second path segment is the continent key.
        continent = prefix.split('.', 2)[1]
        builder = ijson.ObjectBuilder()
        builder.event(event, value)
        # Feed events to the builder until this array closes.
        for nprefix, event, value in parser:
            if (nprefix, event) == (prefix, 'end_array'):
                break
            builder.event(event, value)
        data = builder.value
        print(continent, data)
This prints every array stored under a 'data' key of a list entry (that is, under a prefix ending in '.item.data') beneath the top-level 'earth' key. It also extracts the continent key each array was found under.
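The startswith/endswith test can be generalised into a small wildcard matcher for prefixes. This is only a sketch: prefix_matches is a hypothetical helper, not an ijson function, and it assumes that no dictionary key contains a dot, since prefixes are split on '.'.

```python
def prefix_matches(prefix, pattern):
    """Return True if an ijson-style prefix matches `pattern`.

    In `pattern`, '*' stands for exactly one path segment; every other
    segment must match literally. Hypothetical helper, not part of ijson.
    """
    parts = prefix.split('.')
    wanted = pattern.split('.')
    # A different number of segments can never match.
    if len(parts) != len(wanted):
        return False
    return all(w == '*' or w == p for p, w in zip(parts, wanted))


# Example: matches the 'data' array under any continent, but not its items.
print(prefix_matches('earth.europe.item.data', 'earth.*.item.data'))
print(prefix_matches('earth.europe.item.data.item', 'earth.*.item.data'))
```

In the event loop, the condition could then read prefix_matches(prefix, 'earth.*.item.data'), which also rejects deeper paths that merely happen to end in '.item.data'.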