pythonset

How does Python's set difference work internally?


Recently, I have been looking through some Python modules to understand their behavior and how optimized their implementations are. Can anyone tell what algorithm Python uses to perform the set difference operations? One possible way to achieve set difference is by using hash tables which will involve an extra N space. I tried to find the source code of the set operations but I am not able to find out the code location. Please help.


Solution

  • A set in python is a hash itself. So implementing difference for it is not as hard as you imagine. Looking from a higher level how does one implement set difference? Iterate over one of the collections and add to the result all elements that are not present in the other sequence.