python string nlp fuzzy-search

Python fuzzy search and replace


I need to perform a fuzzy search for a substring in a string and replace the matched part. For example:

str_a = "Alabama"
str_b = "REPLACED"
orig_str = "Flabama is a state located in the southeastern region of the United States."
print(fuzzy_replace(str_a, str_b, orig_str)) # fuzzy_replace code should be implemented
# Output: REPLACED is a state located in the southeastern region of the United States.

The search itself is simple with the fuzzywuzzy module, but it only gives me a similarity ratio between strings. Is there a way to find the position in the original string where the substring fuzzy-matches?


Solution

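  • Try this. It compares word-level chunks of orig_str against str_a and replaces the first chunk whose ratio exceeds 75: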

    from fuzzywuzzy import fuzz

    def fuzzy_replace(str_a, str_b, orig_str):
        n = len(str_a.split())  # Number of words in the search string
        words = orig_str.split()
        # Slide an n-word window over orig_str
        for i in range(len(words) - n + 1):
            test = " ".join(words[i:i + n])
            if fuzz.ratio(str_a, test) > 75:  # fuzzywuzzy similarity score, 0-100
                # Replace the whole n-word window, not just its first word
                return " ".join(words[:i] + [str_b] + words[i + n:])
        return orig_str  # No chunk matched; return the input unchanged

    str_a = "Alabama is a"
    str_b = "REPLACED"
    orig_str = "Flabama is a state located in the southeastern region of the United States."
    print(fuzzy_replace(str_a, str_b, orig_str))
    

    This prints:

    REPLACED state located in the southeastern region of the United States.
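
If you need the character position of the match rather than a word-level replacement, here is a minimal sketch using only the standard library's difflib instead of fuzzywuzzy. The names fuzzy_find and fuzzy_replace_span are made up for illustration, and the fixed-length window is an assumption: it handles substituted characters well but not matches where characters were inserted or deleted.

```python
from difflib import SequenceMatcher

def fuzzy_find(needle, haystack, threshold=0.75):
    """Return (start, end) of the best fuzzy match of needle in haystack, or None.

    Hypothetical helper: slides a window of len(needle) characters over
    haystack and keeps the first window with the highest similarity ratio.
    """
    n = len(needle)
    best_score, best_start = 0.0, None
    for start in range(max(len(haystack) - n, 0) + 1):
        window = haystack[start:start + n]
        score = SequenceMatcher(None, needle, window).ratio()  # 0.0 to 1.0
        if score > best_score:  # strict '>' keeps the earliest of tied windows
            best_score, best_start = score, start
    if best_start is None or best_score < threshold:
        return None
    return best_start, best_start + n

def fuzzy_replace_span(needle, replacement, haystack, threshold=0.75):
    """Replace the best fuzzy match of needle; return haystack unchanged if none."""
    span = fuzzy_find(needle, haystack, threshold)
    if span is None:
        return haystack
    start, end = span
    return haystack[:start] + replacement + haystack[end:]

orig_str = "Flabama is a state located in the southeastern region of the United States."
print(fuzzy_find("Alabama", orig_str))
print(fuzzy_replace_span("Alabama", "REPLACED", orig_str))
```

Because it works on characters rather than words, this keeps the original spacing and punctuation around the match. The trade-off is that it computes one ratio per character offset, so it will be slow on long texts.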