python string json grep pprint

Parsing JSON and searching through it


I have this code:

import json
from pprint import pprint

# Load the bookmarks file and pretty-print the parsed structure
with open('bookmarks.json') as json_data:
    jdata = json.load(json_data)
pprint(jdata)

How can I search through it for u'uri': u'http:'?


Solution

  • json.load and json.loads return ordinary Python objects (a dict here, because the top-level JSON value is an object), so you can use the operators that apply to dicts:

    >>> import json
    >>> jdata = json.loads('{"uri": "http:", "foo": "bar"}')
    >>> 'uri' in jdata       # Check if 'uri' is in jdata's keys
    True
    >>> jdata['uri']         # Will return the value belonging to the key 'uri'
    u'http:'
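
    The same dict operators are enough to search a flat object by value. The following is only a minimal sketch, assuming Python 3 (where JSON strings load as str) and reusing the toy document from the example above; the names missing and matches are illustrative.

```python
import json

# Toy document mirroring the example above (not the real bookmarks file).
jdata = json.loads('{"uri": "http:", "foo": "bar"}')

# dict.get returns a default instead of raising KeyError for a missing key.
missing = jdata.get('title', 'No title')

# Collect every key whose string value starts with the prefix we care about.
matches = sorted(k for k, v in jdata.items()
                 if isinstance(v, str) and v.startswith('http:'))

print(missing)
print(matches)
```

    This only looks at the top level; nested structures such as the bookmarks file need a loop or recursion over 'children'.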
    

    Edit: To give an idea of how to loop through the data, consider the following example:

    >>> import json
    >>> with open('bookmarks.json') as f:
    ...     jdata = json.load(f)
    ...
    >>> for c in jdata['children'][0]['children']:
    ...     print('Title: {}, URI: {}'.format(c.get('title', 'No title'),
    ...                                       c.get('uri', 'No uri')))
    ...
    Title: Recently Bookmarked, URI: place:folder=BOOKMARKS_MENU(...)
    Title: Recent Tags, URI: place:sort=14&type=6&maxResults=10&queryType=1
    Title: , URI: No uri
    Title: Mozilla Firefox, URI: No uri
    

    Inspecting the jdata data structure will allow you to navigate it as you wish. The pprint call you already have is a good starting point for this.
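
    When the pprint output is too long to read comfortably, a depth-limited outline can help with that inspection. This is a rough sketch, not part of the original answer: outline and its max_depth parameter are hypothetical names, and sample is a small stand-in for the parsed bookmarks.json that assumes the 'title' / 'uri' / 'children' layout seen above.

```python
def outline(node, depth=0, max_depth=3):
    """Return indented lines listing each node's title and its keys."""
    lines = ['{}{!r} keys={}'.format('  ' * depth,
                                     node.get('title', ''),
                                     sorted(node))]
    # Stop descending once max_depth is reached to keep the outline short.
    if depth < max_depth:
        for child in node.get('children', []):
            lines.extend(outline(child, depth + 1, max_depth))
    return lines


# Hypothetical stand-in for the parsed bookmarks.json.
sample = {
    'title': '',
    'children': [
        {'title': 'Bookmarks Menu',
         'children': [
             {'title': 'Recent Tags',
              'uri': 'place:sort=14&type=6&maxResults=10&queryType=1'},
         ]},
    ],
}

for line in outline(sample):
    print(line)
```

    With the real file you would call outline(jdata) instead of outline(sample).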

    Edit 2: Another attempt. This flattens the file you mentioned into a list of dictionaries, which you should be able to adapt to your needs.

    >>> from pprint import pprint
    >>> def build_structure(data, d=None):
    ...     if d is None:    # avoid sharing one default list between calls
    ...         d = []
    ...     if 'children' in data:
    ...         for c in data['children']:
    ...             d.append({'title': c.get('title', 'No title'),
    ...                       'uri': c.get('uri', None)})
    ...             build_structure(c, d)
    ...     return d
    ...
    >>> pprint(build_structure(jdata))
    [{'title': u'Bookmarks Menu', 'uri': None},
     {'title': u'Recently Bookmarked',
      'uri':   u'place:folder=BOOKMARKS_MENU&folder=UNFILED_BOOKMARKS&(...)'},
     {'title': u'Recent Tags',
      'uri':   u'place:sort=14&type=6&maxResults=10&queryType=1'},
     {'title': u'', 'uri': None},
     {'title': u'Mozilla Firefox', 'uri': None},
     {'title': u'Help and Tutorials',
      'uri':   u'http://www.mozilla.com/en-US/firefox/help/'},
     (...)
    ]
    

    To then "search through it for u'uri': u'http:'", do something like this:

    for c in build_structure(jdata):
        # 'uri' is None for entries without one, so guard before startswith
        if c['uri'] and c['uri'].startswith('http:'):
            print('Started with http')
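
    If you only need the matches, the flattening step and the search can be combined into one recursive generator. This is a sketch under the same assumptions as above (nodes are dicts with optional 'title', 'uri' and 'children' keys); find_uris and the sample data are hypothetical and only illustrate the idea.

```python
def find_uris(node, prefix='http:'):
    """Yield (title, uri) for every node whose 'uri' starts with prefix."""
    uri = node.get('uri')
    # Skip nodes with no 'uri' key (or an empty one) before testing the prefix.
    if uri and uri.startswith(prefix):
        yield node.get('title', 'No title'), uri
    for child in node.get('children', []):
        for hit in find_uris(child, prefix):
            yield hit


# Hypothetical stand-in for the parsed bookmarks.json.
sample = {
    'title': '',
    'children': [
        {'title': 'Bookmarks Menu',
         'children': [
             {'title': 'Recent Tags',
              'uri': 'place:sort=14&type=6&maxResults=10&queryType=1'},
             {'title': 'Mozilla Firefox',
              'children': [
                  {'title': 'Help and Tutorials',
                   'uri': 'http://www.mozilla.com/en-US/firefox/help/'},
              ]},
         ]},
    ],
}

hits = list(find_uris(sample))
for title, uri in hits:
    print('{}: {}'.format(title, uri))
```

    Passing a different prefix, for example find_uris(jdata, 'place:'), would return the internal Firefox entries instead.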