python duplicates periodicity

Find second duplicate and period in a Python list


I have a Python list like this one: [2, 5, 26, 37, 45, 12, 23, 37, 45, 12, 23, 37, 45, 12, 23, 37]. The real list is much longer. The list repeats itself after a certain point, in this case after 37. I have no problem finding the number at which it starts repeating, but I need to truncate the list at that number's second occurrence. In this case the result would be [2, 5, 26, 37, 45, 12, 23, 37]. To find the number (37 in this case) I use a function firstDuplicate() that I found on Stack Overflow. Can someone help me?

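def firstDuplicate(a):
    # Return the first value that occurs a second time, or None if nothing repeats.
    aset = set()
    for i in a:
        if i in aset:
            return i
        aset.add(i)
    return None

# Attempt (does not work: firstDuplicate() returns the value, not its index)
LIST = LIST[1:firstDuplicate(LIST)]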

Solution

  • You can use the same basic idea as firstDuplicate() and write a generator that yields values up to and including the first duplicate. Then pass it to list(), iterate over it in a loop, etc.

    l = [2, 5, 26, 37, 45, 12, 23, 37, 45, 12, 23, 37, 45, 12, 23, 37]
    
    def partitionAtDupe(l):
        seen = set()
        for n in l:
            yield n
            if n in seen:    
                break
            seen.add(n)
    
    
    list(partitionAtDupe(l))
    # [2, 5, 26, 37, 45, 12, 23, 37]
    

    It's not clear what should happen if there are no duplicates. In that case the code above yields the whole list.
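
The title also asks for the period, which the generator does not report. A minimal sketch of one way to get both, assuming (as in the question's example) that the first repeated value marks the start of the cycle and that the elements are hashable. The name truncate_at_first_repeat is hypothetical, and returning None for the period when nothing repeats is an assumption, not something the question specifies:

```python
def truncate_at_first_repeat(values):
    """Return (prefix, period); period is None when nothing repeats."""
    first_index = {}  # value -> index of its first occurrence
    for i, value in enumerate(values):
        if value in first_index:
            # Keep everything up to and including the second occurrence.
            return values[:i + 1], i - first_index[value]
        first_index[value] = i
    # Assumption: with no repeat, return the whole list and no period.
    return list(values), None


data = [2, 5, 26, 37, 45, 12, 23, 37, 45, 12, 23, 37, 45, 12, 23, 37]
prefix, period = truncate_at_first_repeat(data)
print(prefix)  # [2, 5, 26, 37, 45, 12, 23, 37]
print(period)  # 4
```

A dict is used instead of a set so the index of the first occurrence is available when the repeat is found; the period is then the distance between the two occurrences.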