How can I calculate the Jaccard Similarity of two lists containing strings in Python?


I have two lists of usernames and I want to calculate the Jaccard similarity between them. Is this possible?

This thread shows how to calculate the Jaccard similarity between two strings; however, I want to apply it to two lists, where each element is one word (e.g., a username).


Solution

  • I ended up writing my own solution after all:

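    def jaccard_similarity(list1, list2):
        # Compare the lists as sets, so duplicate entries are ignored.
        set1, set2 = set(list1), set(list2)
        union = len(set1 | set2)
        # Two empty lists have an empty union; returning 1.0 here is a convention that avoids dividing by zero.
        return len(set1 & set2) / union if union else 1.0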
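
As a quick check of the function above, here is a self-contained sketch of the same idea written with set operators. The usernames are made up for illustration, the name `jaccard` is just a placeholder, and returning 1.0 for two empty inputs is an assumption on my part (you may prefer 0.0 or raising an error):

```python
def jaccard(a, b):
    """Jaccard similarity of two iterables, compared as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty inputs are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


# Hypothetical usernames, for illustration only.
users_a = ["alice", "bob", "carol", "bob"]
users_b = ["bob", "carol", "dave"]

score = jaccard(users_a, users_b)
print(score)  # 2 shared names out of 4 distinct names -> 0.5
```

Note that the repeated "bob" in the first list does not change the result, because both lists are converted to sets before comparing. Matching is also exact, so "Bob" and "bob" would count as different usernames unless you normalize them first.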