
How to return the best-matching value via SequenceMatcher


I need to match a product's category name returned in an API response against the category names stored in the database.

For example:

api_category = "packing tape"

category names from DB = ["packing material", "packaging equipment"]

from difflib import SequenceMatcher

for e in Category.objects.all():
    # Similarity score between the API name and this category's name
    matching_category = SequenceMatcher(None, api_category, e.name).quick_ratio()
    print(matching_category)

Output:

0.36363636363636365
0.4090909090909091

This only gives me floats, but I want to get the best-matching element (e) itself.


Solution

  • I solved the problem using the FuzzyWuzzy library.

    from fuzzywuzzy import process

    def to_internal_value(self, data):
        internal_data = super().to_internal_value(data)
        # extractOne returns a (name, score) tuple for the closest name, or None
        api_category = process.extractOne(
            data.get("category"),
            Category.objects.values_list("name", flat=True),
        )
        if api_category:
            # The matched name comes from the database, so name=... would also work here
            category = Category.objects.filter(
                name__icontains=api_category[0]
            ).first()
            if category:
                internal_data["category"] = category
        return internal_data
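
For comparison, the original SequenceMatcher loop can probably be made to return the best element without an extra dependency by keeping each candidate next to its score and taking the maximum. The sketch below is only an illustration: `best_match` is a hypothetical helper name, lowercasing both strings before comparing is an assumption, and plain strings stand in for the category names.

```python
from difflib import SequenceMatcher


def best_match(query, candidates):
    """Return the (candidate, score) pair with the highest ratio, or None if there are no candidates."""
    scored = [
        # Assumption: compare case-insensitively by lowercasing both sides
        (name, SequenceMatcher(None, query.lower(), name.lower()).ratio())
        for name in candidates
    ]
    if not scored:
        return None
    # max() with a key picks the pair whose score (second item) is largest
    return max(scored, key=lambda pair: pair[1])


names = ["packing material", "packaging equipment"]
result = best_match("packing tape", names)
print(result)
```

In the Django case, the candidates could be something like `Category.objects.values_list("name", flat=True)`, and the returned name can then be used to look up the Category instance. A minimum score threshold may be worth adding so that a poor best match is not accepted.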