pythonsubsequence

Subsequence match for a simple list and a list of sets?


This is a problem involving subsequences. The ordinary subsequence problem has a list l1 and asks whether another list l2 is a subsequence. For example:

l1 = [3, 1, 4, 1, 5, 9]

l2 = [3, 1, 5]            => True (remove 4, 1, 9)

l2 = [4, 3, 5]            => False

In my problem, l2 instead has sets of values. For example:

l1 = [3, 1, 4, 1, 5, 9]
l2 = [{2, 3}, {1, 5, 2}, {9}]
=> True, because:
     {2, 3} matches 3
     {1, 5, 2} matches 1 (or 5)
     {9} matches 9

Here, l2 can be thought of conceptually expanded to different possibilities by taking one element from each set:

l2exp = [[2, 1, 9], [2, 5, 9], [2, 2, 9], [3, 1, 9], [3, 5, 9], [3, 2, 9]]

which means as long as one of those six possible lists represented by l2 matches l1, we have a successful match. Since [3, 1, 9] matches l1 , the whole l2 matches.

So one solution may be first flatten the l2 into the new l2exp like above, and then for each sublist_l2 in l2exp, the ordinary subsequence check all(e in iter_of_l1 for e in sublist_l2) can be used.

How to do this match without explicitly expanding the l2 into a list of lists?


Solution

  • Witness Versions

    Instead of just knowing whether there's a match, one might want to also know what that match is. Maybe to truly use it, or maybe just to be able to verify the correctness. Here are a few versions, returning a list of the actual matched elements, or None if there's no match. The first two might be the best.

    Perhaps the simplest, assume each pool from l2 does have a match in l1, and catch the exception:

    def subseq1(l1, l2):
        it1 = iter(l1)
        try:
            return [next(x for x in it1 if x in pool)
                    for pool in l2]
        except StopIteration:
            pass
    

    Basic loops:

    def subseq2(l1, l2):
        witness = []
        it1 = iter(l1)
        for pool in l2:
            for x in it1:
                if x in pool:
                    witness.append(x)
                    break
            else:
                return
        return witness
    

    Slightly modifying my original if(all(any(... solution to append each witnessing element:

    def subseq3(l1, l2):
        witness = []
        it1 = iter(l1)
        if all(any(x in pool and not witness.append(x)
                   for x in it1)
               for pool in l2):
            return witness
    

    Pre-append a spot for the next witness element:

    def subseq4(l1, l2):
        witness = []
        it1 = iter(l1)
        if all(any(witness[-1] in pool
                   for witness[-1] in it1)
               for pool in l2
               if not witness.append(None)):
            return witness
    

    Pre-allocate spots for all witness elements:

    def subseq5(l1, l2):
        witness = [None] * len(l2)
        it1 = iter(l1)
        if all(any(witness[i] in pool
                   for witness[i] in it1)
               for i, pool in enumerate(l2)):
            return witness
    

    Check for false witnesses at the end:

    def subseq6(l1, l2):
        it1 = iter(l1)
        false = object()
        witness = [next((x for x in it1 if x in pool), false)
                   for pool in l2]
        if false not in witness:
            return witness
    

    Testing code (cases taken from Alain's answer):

    funcs = [
        subseq1,
        subseq2,
        subseq3,
        subseq4,
        subseq5,
        subseq6,
    ]
    
    for func in funcs:
        l1 = [1, 2, 3, 4]
        l2 = [{2, 1}, {1, 3}]
        print(func(l1,l2)) # True
    
        l1 = [1, 2, 3, 4]
        l2 = [{2, 1}, {1, 3}, {3, 4}]
        print(func(l1,l2)) # True
    
        l1 = [1, 2, 4, 3]
        l2 = [{2, 1}, {1, 3}, {3, 4}]
        print(func(l1,l2)) # False
    
        print()