I'm struggling to do this in a Pythonic way. I have a list of tuples, which we can call names:
[('Jimmy', 'Smith'), ('James', 'Wilson'), ('Hugh', 'Laurie')]
And then I have two variables:
First_name = 'Jimm'
Last_name = 'Smitn'
I want to iterate through this list of first and last names, fuzzy match each entry against these values, and return the entry that is closest to the specified First_name and Last_name.
You can implement fuzzy matching by using max() to pick the entry with the best match ratio returned by difflib.SequenceMatcher().
To do this, pass a lambda as the key argument that returns the match ratio. In my example I use SequenceMatcher.ratio(), but if performance matters you should also try SequenceMatcher.quick_ratio() and SequenceMatcher.real_quick_ratio(), which are cheaper to compute but only give an upper bound on ratio().
from difflib import SequenceMatcher
lst = [('Jimmy', 'Smith'), ('James', 'Wilson'), ('Hugh', 'Laurie')]
first_name = 'Jimm'
last_name = 'Smitn'
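# Fix the target name as the first sequence; each candidate becomes the second.
matcher = SequenceMatcher(a=first_name + ' ' + last_name)
# set_seq2() returns None, so "or" falls through to matcher.ratio() as the key.
match_first_name, match_last_name = max(
    lst,
    key=lambda x: matcher.set_seq2(' '.join(x)) or matcher.ratio())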
print(first_name, last_name, '-', match_first_name, match_last_name)
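If you need this in more than one place, the same idea can be wrapped in a function. The following is only a sketch under a few assumptions of mine: the function name closest_name, the cutoff parameter, and the lowercasing of both strings are not part of the original answer. It uses real_quick_ratio() and quick_ratio() as cheap upper bounds to skip candidates that cannot beat the current best, and it keeps the fixed target as the second sequence, because SequenceMatcher caches its analysis of the second sequence.

```python
from difflib import SequenceMatcher


def closest_name(names, first_name, last_name, cutoff=0.0):
    """Return (best_tuple, score), or (None, 0.0) if nothing reaches cutoff.

    closest_name and cutoff are hypothetical names used for this sketch.
    """
    matcher = SequenceMatcher()
    # The target is compared against every candidate, so it goes in seq2,
    # which SequenceMatcher analyses once and caches.
    matcher.set_seq2(f"{first_name} {last_name}".lower())

    best, best_score = None, 0.0
    for candidate in names:
        matcher.set_seq1(" ".join(candidate).lower())
        floor = max(best_score, cutoff)
        # Both quick ratios are upper bounds on ratio(), so a candidate whose
        # bound is below the floor cannot win and the full ratio is skipped.
        if matcher.real_quick_ratio() < floor or matcher.quick_ratio() < floor:
            continue
        score = matcher.ratio()
        if score >= cutoff and (best is None or score > best_score):
            best, best_score = candidate, score
    return best, best_score


names = [('Jimmy', 'Smith'), ('James', 'Wilson'), ('Hugh', 'Laurie')]
best, score = closest_name(names, 'Jimm', 'Smitn')
print(best, round(score, 3))
```

Passing a cutoff such as 0.6 makes the function return None when no name is reasonably close, instead of always returning the least bad entry the way max() does.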