python dictionary inverted-index

How to create an inverted index given a list of tuples?


As an exercise, I have implemented the following function inverted_idx(data), which builds an inverted index from a list of tuples. The keys of the dictionary are the distinct elements found in the tuples, and the value associated with each key is a list of the indexes of all the tuples that contain that element.

The function code is:

def inverted_idx(data):
    rows = []
    dictionary = {}
    for idx, x in enumerate(data):
        rows.append((idx, x))
    for idx, x in rows:
        for key in x:
            if key in dictionary:
                dictionary[key].append(idx)
            else:
                dictionary[key] = [idx]
    return dictionary

By using it on a list of tuples:

A = [(10, 4, 53), (0, 3, 10), (12, 6, 2), (8, 4, 0), (12, 3, 9)]
inverted_idx(data=A)

Result:

{10: [0, 1],
 4: [0, 3],
 53: [0],
 0: [1, 3],
 3: [1, 4],
 12: [2, 4],
 6: [2],
 2: [2],
 8: [3],
 9: [4]}

The function works properly. Now I want to modify it so that the inverted index is built only from the elements that occupy a specific position in each tuple. Let's say that I want to create an inverted index just for the element in position 1.

The desired output would be:

{4: [0, 3],
 3: [1, 4],
 6: [2]}

How could I change the code so that the inverted index is created just for the element in a given position?

This is what I have tried:

def inverted_idx(data):
    rows = []
    dictionary = {}
    for idx, x in enumerate(data):
        rows.append((idx, x))
    for idx, x[1] in rows: # trying to access the element in position 1
        for key in x:
            if key in dictionary:
                dictionary[key].append(idx)
            else:
                dictionary[key] = [idx]
    return dictionary

But of course, I got the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-79-10c9adaea533> in <module>
      1 A = [(10, 4, 53), (0, 3, 10), (12, 6, 2), (8, 4, 0), (12, 3, 9)]
      2
----> 3 inverted_idx(data = A)

<ipython-input-78-d3f320303057> in inverted_idx(data)
      4     for idx, x in enumerate(data):
      5         rows.append((idx, x))
----> 6     for idx, x[1] in rows:
      7         for key in x:
      8             if key in dictionary:

TypeError: 'tuple' object does not support item assignment

Solution

  • The solution from Andrej Kesely is shorter, but I would still like to submit my version, which keeps your style:

    def inverted_idx(data):
        dictionary = {}
        for idx, x in enumerate(data):
            # Enumerate the tuple's elements so each one's position is known.
            for index, key in enumerate(x):
                if index != 1:  # keep only the element at position 1
                    continue
                if key in dictionary:
                    dictionary[key].append(idx)
                else:
                    dictionary[key] = [idx]
        return dictionary
    

    Returns the following:

    {3: [1, 4], 4: [0, 3], 6: [2]}
    

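    Hope this clears it up. The key change is to enumerate the elements of each tuple as well, so that every position other than 1 can be skipped. Your attempt failed because a loop written as "for idx, x[1] in rows" makes Python assign each value to x[1], and tuples do not support item assignment.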
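
As a possible generalization of the answer above, the position could be passed in as an argument and the tuple indexed directly, instead of looping over every element. The following is only a minimal sketch: the function name inverted_idx_at and the position parameter are hypothetical, introduced here for illustration, and defaultdict is swapped in for the explicit "if key in dictionary" check.

```python
from collections import defaultdict


def inverted_idx_at(data, position=1):
    """Map each value found at `position` to the indexes of the tuples holding it.

    `inverted_idx_at` and `position` are hypothetical names, not from the question.
    """
    index = defaultdict(list)
    for idx, row in enumerate(data):
        # Index the tuple directly; no inner loop over its elements is needed.
        index[row[position]].append(idx)
    return dict(index)  # return a plain dict, like the original function


A = [(10, 4, 53), (0, 3, 10), (12, 6, 2), (8, 4, 0), (12, 3, 9)]
print(inverted_idx_at(A, position=1))  # {4: [0, 3], 3: [1, 4], 6: [2]}
```

Note that this sketch assumes every tuple is long enough to have an element at the requested position; a shorter tuple would raise an IndexError.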