I'm using hunspell with the pyhunspell wrapper. I'm calling:
hunspell.suggest("Yokk")
But this is returning only ["Yolk", "Yoke"]. I saw that "York" is in the dictionary but is not being returned. Is there a way to return more than 2 suggestions, either by increasing the distance threshold or the number of top suggestions?
The text I'm trying to correct is "New York", and I have my own ranker that ranks the suggestions downstream. I just need more suggestions. I tried aspell, and by default it returns 10 suggestions, one of which is in fact "York".
Note: The documentation doesn't mention any other arguments for the suggest method. Even using the CLI I only get two suggestions:
hunspell -d en_US
Hunspell 1.7.2
yokk
& yokk 2 0: yolk, yoke
I've checked the default dictionaries are properly loaded using:
hunspell -D
SEARCH PATH:
...
AVAILABLE DICTIONARIES (path is not mandatory for -d option):
/Library/Spelling/en_US
LOADED DICTIONARY:
/Library/Spelling/en_US.aff
/Library/Spelling/en_US.dic
And I've also checked that the expected "York" is in the dictionary:
cat /Library/Spelling/en_US.dic | grep York
York/M
I wonder if there is some other configuration I can set somewhere. I can't see anything relevant in either the wrapper or the CLI documentation: https://github.com/pyhunspell/pyhunspell/wiki/Documentation https://github.com/hunspell/hunspell
I installed the 2020 dictionaries from this link:
http://wordlist.aspell.net/dicts/
The newest dictionaries are available here:
https://github.com/LibreOffice/dictionaries
Then I tested this code:
import hunspell  # pyhunspell 0.5.5

# Load the dictionary (.dic) and affix (.aff) files from the current directory.
hobj = hunspell.HunSpell('./en_US.dic', './en_US.aff')
tt = hobj.suggest("Yokk")
print(tt)
And I got output that differs from yours:
['York', 'Yoko', 'Yolk', 'Yoke']
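If switching dictionaries is not an option, one possible fallback is to generate extra candidates yourself and pass them to your downstream ranker. The sketch below is not Hunspell's suggestion algorithm: it uses difflib.get_close_matches from Python's standard library over the entries of a .dic file. The function names parse_dic_lines and more_candidates are hypothetical, and the parser assumes the common .dic layout (an optional word count on the first line, then entries such as York/M). Matching is case-sensitive and only sees the stems listed in the file, not forms produced by the affix rules.

```python
import difflib


def parse_dic_lines(lines):
    """Extract bare words from the lines of a Hunspell .dic file.

    Assumes the common layout: an optional word count on the first line,
    then one entry per line such as "York/M" (word, slash, affix flags).
    """
    words = []
    for i, raw in enumerate(lines):
        entry = raw.strip()
        if not entry or (i == 0 and entry.isdigit()):
            continue  # skip blank lines and the leading word count
        # Drop affix flags after "/" and any tab-separated morphological data.
        word = entry.split("\t")[0].split("/")[0]
        if word:
            words.append(word)
    return words


def more_candidates(word, vocabulary, n=10, cutoff=0.6):
    """Return up to n vocabulary words similar to `word`, per difflib's ratio."""
    return difflib.get_close_matches(word, vocabulary, n=n, cutoff=cutoff)


if __name__ == "__main__":
    # Small in-memory sample standing in for the contents of en_US.dic.
    sample = ["5", "York/M", "Yoko/M", "yolk/SM", "yoke/DSMG", "apple/SM"]
    vocab = parse_dic_lines(sample)
    print(more_candidates("Yokk", vocab))
```

To use it with a real dictionary, open the .dic file and pass the file object to parse_dic_lines, then merge the result with whatever hobj.suggest returns before ranking. Lowering cutoff or raising n widens the candidate set, at the cost of more noise for the ranker to filter.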