I am trying to use the ffi-aspell gem to spell-check a text. To do that, it seems that I have to extract the words myself. I am trying to do so by applying String#scan to the text with a regex, but it does not seem straightforward.
What is the easiest way to define the class of characters that may appear in an ffi-aspell dictionary for a given language? I want this to work not only for English, so patterns like /[a-zA-Z']/ for the character (or /[a-zA-Z']+/ for the word) do not work. /[[:word:]]/ seems to match characters that are not in the dictionary, such as numerals, and it does not match the apostrophe (single quote), which is frequently used within a word. Is there some documentation that defines the character set used in an ffi-aspell dictionary?
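To illustrate the problem, here is roughly what the two patterns do with String#scan. The sample strings are made up for the example and have nothing to do with any particular dictionary.

```ruby
# Made-up sample text, for illustration only.
TEXT = "Don't panic, it's 42 o'clock"

# ASCII-only class: keeps apostrophes and skips digits, but is English-centric.
ASCII_WORDS = TEXT.scan(/[a-zA-Z']+/)

# POSIX word class: includes digits and splits words at the apostrophe.
POSIX_WORDS = TEXT.scan(/[[:word:]]+/)

# The ASCII-only class also breaks on non-English letters.
SPLIT_WORD = "naïve".scan(/[a-zA-Z']+/)

p ASCII_WORDS
p POSIX_WORDS
p SPLIT_WORD
```

Neither result is what I want: the first cuts "naïve" into two pieces, and the second returns "42" as a word and turns "Don't" into "Don" and "t".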
I guess it would be easier to scan the ffi-aspell dictionary for its entries first, collect the unique characters they use, and then just combine those with Regexp.union.
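A minimal sketch of that idea follows. It assumes you already have the dictionary entries as an array of strings; how to obtain them is left open here, and SAMPLE_ENTRIES and word_pattern are hypothetical names used only for this example, not part of ffi-aspell.

```ruby
# Hypothetical stand-in for a real list of dictionary entries.
SAMPLE_ENTRIES = ["don't", "hello", "world"]

# Build a regex that matches runs of characters seen in any entry.
def word_pattern(entries)
  # Every distinct character used across all entries.
  chars = entries.flat_map(&:chars).uniq
  # Regexp.union escapes each one-character string and joins them with "|".
  /(?:#{Regexp.union(chars).source})+/
end

WORD_RE = word_pattern(SAMPLE_ENTRIES)

p "hello world, don't 42".scan(WORD_RE)
```

With this sample the digits are skipped and the apostrophe is kept, because only characters that occur in some entry can match. Note that the pattern is case-sensitive as written, and a run of allowed characters is not necessarily a dictionary word, so each match would still need to be checked against the speller.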