
Removing empty dictionary elements in Python


My program scrapes a website and creates two lists, one for categories and one for content. I then use dict(zip(...)) to pair them up and put them into a dict.

Something like this:

complete_dict=dict(zip(category_list,info_list))

The problem is that my program reads empty elements into both lists (category and info). That would be fine as long as I could remove them later, but I have failed to find a way to do so. When printed, both lists contain empty elements: not empty strings, but something more like an empty list within a list. I tried to remove them both from the lists and from the dictionary after zipping, using commands like:

category_list=filter(None, category_list)

or:

info_list=[x for x in info_list if x != []]

Of course, the same operation is applied to both lists.

Neither worked. I then tried doing it in the dictionary with:

dict((k, v) for k, v in complete_list.iteritems() if v)

What else can I try at this point?

Edit

I tried filtering, and either my conditions are not set correctly or it simply doesn't solve the problem. I'm looking for another approach, so this is not a duplicate of another thread (though that thread has some useful info).

Edit 2

What I'm getting right now is:

[u'info1', u'info2', u'info3', u'info4', ...]

[]

[]

[]

[]

[u'info1', u'info2', u'info3', u'info4', ...]

[]

[]

[]

[u'info1', u'info2', u'info3', u'info4', ...]

info1, info2, info3, and info4 (there are actually more elements) are content scraped from the website. Sorry, I can't reveal what they are, but this shows the idea. This is one of the lists (info_list), and I'm trying to remove all the []'s stuck in the middle, so the result should look like:

[u'info1', u'info2', u'info3', u'info4', ...]

[u'info1', u'info2', u'info3', u'info4', ...]

[u'info1', u'info2', u'info3', u'info4', ...]

and so on

Edit 3

My result looks like this after dict(zip(...))

{u'category1': u'info1', u'category2': u'info2', ...}

{}

{}

{u'category1': u'info1', u'category2': u'info2', ...}

{u'category1': u'info1', u'category2': u'info2', ...}

{}

{}

{}

and so on.


Solution

  • "... but more like an empty list within a list."

    Assuming this is guaranteed, you can do:

    # make sure value is not "[]" or "[[]]"
    {k: v for k, v in complete_list.iteritems() if v and v[0]}
    

    Example:

    complete_list = {'x': [[]], 'y': [], 'z': [[1]]}
    {k: v for k, v in complete_list.iteritems() if v and v[0]}
    # returns {'z': [[1]]}
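
    Note that iteritems() only exists in Python 2. As a minimal sketch, assuming you want the same filter to run on Python 3 as well, the comprehension can be wrapped in a small helper that uses items() instead. The name drop_empty is hypothetical and used only for illustration:

```python
def drop_empty(mapping):
    """Return a copy of mapping without values that are [] or that start with an empty item, e.g. [[]]."""
    # items() works on both Python 2 and Python 3; iteritems() is Python 2 only.
    return {k: v for k, v in mapping.items() if v and v[0]}


complete_list = {'x': [[]], 'y': [], 'z': [[1]]}
cleaned = drop_empty(complete_list)
print(cleaned)  # {'z': [[1]]}
```

    Be aware that the v[0] check also drops values whose first element is falsy for another reason, such as [0] or [''], so it is only appropriate if that cannot occur in your data.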
    

    EDIT

    From your updated question, I see you are zipping lists together after scraping from a website like so:

    complete_dict=dict(zip(category_list,info_list))
    

    It looks like your info_list is empty in some cases. Just check for that before zipping:

    if info_list:
        complete_dict=dict(zip(category_list,info_list))
    

    to ensure you don't zip category_list with an empty list.
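
    The output in your edits suggests the zip happens once per scraped page, inside a loop. A rough sketch of how the check could fit into such a loop is below. The loop structure is an assumption (your scraping code isn't shown), and build_dicts and scraped_pages are hypothetical names that stand in for whatever your program actually produces:

```python
def build_dicts(scraped_pages):
    """Zip each (category_list, info_list) pair, skipping pages that produced no data."""
    results = []
    for category_list, info_list in scraped_pages:
        # An empty list is falsy, so this skips pages where either list came back empty.
        if not category_list or not info_list:
            continue
        results.append(dict(zip(category_list, info_list)))
    return results


# Stand-in data: the middle "page" yielded nothing, like the [] lines in the question.
scraped_pages = [
    (['category1', 'category2'], ['info1', 'info2']),
    ([], []),
    (['category1', 'category2'], ['info1', 'info2']),
]
complete_dicts = build_dicts(scraped_pages)
print(len(complete_dicts))  # 2
```

    Skipping the empty pages before zipping means no {} is ever created, so there is nothing left to filter out afterwards.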