According to my research there are two easy ways to remove duplicates from a list:
a = list(dict.fromkeys(a))
and
a = list(set(a))
Is one of them more efficient than the other?
The second one is generally faster: sets are built for exactly this purpose, and you skip the overhead of a dict, which has to store a value slot for every key. How large the gap is, though, depends on the actual payload:
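One behavioral difference worth knowing before picking on speed alone: dict.fromkeys keeps the first-seen order of the elements (dict insertion order is guaranteed since Python 3.7), while a set imposes no order. A quick sketch:

```python
a = [3, 1, 3, 2, 1]

# dict.fromkeys preserves the order elements first appeared in
print(list(dict.fromkeys(a)))  # [3, 1, 2]

# set order is arbitrary, so sort if you need a stable result
print(sorted(set(a)))          # [1, 2, 3]
```

If order doesn't matter for your use case, this difference is irrelevant and the set version is the simpler choice.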
import timeit
import random
input_data = [random.choice(range(100)) for i in range(1000)]
from_keys = timeit.timeit('list(dict.fromkeys(input_data))', number=10000, globals={'input_data': input_data})
from_set = timeit.timeit('list(set(input_data))', number=10000, globals={'input_data': input_data})
print(f"From keys performance: {from_keys:.3f}")
print(f"From set performance: {from_set:.3f}")
Prints:
From keys performance: 0.230
From set performance: 0.140
Even though the set version is about 1.6x faster here, the absolute gap over 10,000 runs is roughly 0.09 s, i.e. only a few microseconds per call, so in practice the difference is barely visible. Try it yourself with different random data.