pythonlistnestedduplicates

How to find the short list from a long list if the elements in the short one are all in the long one without changing the sequence using Python?


I have a nested list:

a=[[1,2,3],[0,1,2,3,4],[5,7,9],[5,6,7,8,9],[7,10],[10,7,2,1]]

Now, I want to delete [1,2,3] because it is included in [0,1,2,3,4]. And I need to keep [5,7,9] because although it is included in [5,6,7,8,9] but there are some numbers in between. Also, [7,10] should be kept too because these numbers are in the wrong order in [10,7,2,1]. So, the simplified list is:

a=[[0,1,2,3,4],[5,7,9],[5,6,7,8,9],[7,10],[10,7,2,1]]

So, I need to delete the short sub lists if they are included in the long ones without any changes. How can I do that?


Solution

  • You can loop backwards over the indexes of a in twos, starting at the second-last sublist, checking whether the list is included in the next one.

    To test whether it is included, you have to test all the possible start positions in the second list, and then check each pair of items to see if they are the same.

    a = [[1,2,3], [0,1,2,3,4], [5,7,9], [5,6,7,8,9], [7,10], [10,7,2,1]]
    
    def included_in(list1, list2):
        "test if list1 is included in list2"
        r = range(len(list1))
        for start_pos in range(len(list2) - len(list1)):
            for offset in r:
                if list2[start_pos + offset] != list1[offset]:
                    break
            else:
                # we got through all the positions in list1 without finding 
                # corresponding element of list2 that *didn't* match
                return True
        # we got through all the starting positions in list2 without finding 
        # a complete match
        return False
                
    
    for index in range(len(a)-2, -1, -2):
        list1 = a[index]
        list2 = a[index + 1]
        if included_in(list1, list2):
            a.pop(index)
    
    print(a)
    

    This gives this output:

    [[0, 1, 2, 3, 4], [5, 7, 9], [5, 6, 7, 8, 9], [7, 10], [10, 7, 2, 1]]