Tags: python, spell-checking, hunspell

How to get more word suggestions from Hunspell with pyhunspell


I'm using hunspell with the pyhunspell wrapper. I'm calling:

hunspell.suggest("Yokk")

But this is returning only ["Yolk", "Yoke"]. I saw that "York" is in the dictionary but is not being returned. Is there a way to return more than 2 suggestions, either by increasing the distance threshold or the number of top suggestions?

The text I'm trying to correct is "New York", and I have my own ranker that ranks the suggestions downstream, so I just need more suggestions. I tried aspell, and by default it returns 10 suggestions, one of which is in fact "York".

Note: The documentation doesn't mention any other arguments for the suggest method. Even using the CLI I only get two suggestions:

hunspell -d en_US
Hunspell 1.7.2
yokk
& yokk 2 0: yolk, yoke

I've checked the default dictionaries are properly loaded using:

hunspell -D
SEARCH PATH:
...
AVAILABLE DICTIONARIES (path is not mandatory for -d option):
/Library/Spelling/en_US
LOADED DICTIONARY:
/Library/Spelling/en_US.aff
/Library/Spelling/en_US.dic
➜  2 subl /Library/Spelling/en_US.dic

And I've also checked that the expected "York" is in the dictionary:

cat /Library/Spelling/en_US.dic | grep York
York/M

I wonder if there is some other configuration I can set somewhere. I can't see anything evident in either the wrapper or the CLI documentation: https://github.com/pyhunspell/pyhunspell/wiki/Documentation https://github.com/hunspell/hunspell


Solution

  • I installed the 2020 dictionaries from this link:
    http://wordlist.aspell.net/dicts/
    The newest dictionaries are available here:
    https://github.com/LibreOffice/dictionaries

    Then tested the code:

    import hunspell  # version 0.5.5

    # Load the newly installed dictionary (.dic) and affix (.aff) files.
    hobj = hunspell.HunSpell('./en_US.dic', './en_US.aff')
    tt = hobj.suggest("Yokk")
    print(tt)

    And got output that differs from yours:

    ['York', 'Yoko', 'Yolk', 'Yoke']
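
If updating the dictionary still leaves the list too short for a downstream ranker, one possible workaround is to widen it with candidates drawn straight from the .dic word list. The sketch below is only an illustration under stated assumptions: it uses Python's standard difflib module as a stand-in similarity measure, which is not Hunspell's own suggestion algorithm, and the helper names words_from_dic and widen_suggestions are hypothetical. The .dic parsing is simplified (it only strips the "/FLAGS" suffix), and the sample entries other than York/M carry made-up flags.

```python
import difflib


def words_from_dic(lines):
    """Yield bare words from Hunspell .dic lines, dropping any '/FLAGS' suffix."""
    for line in lines:
        entry = line.strip()
        # Skip blank lines and the leading entry-count line.
        if not entry or entry.isdigit():
            continue
        yield entry.split("/", 1)[0]


def widen_suggestions(word, hunspell_suggestions, dic_words, n=10, cutoff=0.6):
    """Return Hunspell's suggestions first, then extra close matches, no duplicates."""
    # difflib is an assumption here: a generic similarity ratio, not Hunspell's logic.
    extra = difflib.get_close_matches(word, dic_words, n=n, cutoff=cutoff)
    merged = list(hunspell_suggestions)
    for candidate in extra:
        if candidate not in merged:
            merged.append(candidate)
    return merged


# Tiny in-memory stand-in for en_US.dic (flags are illustrative);
# real use would read the lines from the dictionary file instead.
sample_dic = ["4", "York/M", "yolk/MS", "yoke/DSMG", "Yoko/M"]
dic_words = list(words_from_dic(sample_dic))

# ["Yolk", "Yoke"] is the suggest("Yokk") output reported in the question.
candidates = widen_suggestions("Yokk", ["Yolk", "Yoke"], dic_words)
print(candidates)
```

In real use, the second argument would be the result of hobj.suggest(word), and n and cutoff can be tuned to trade recall against noise before the ranker sees the list. Scanning a full dictionary this way is slower than Hunspell itself, so it may be worth doing only when the suggestion list comes back short.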