Given a word, which may or may not be a singular-form noun, how would you generate its plural form?
Based on this NLTK tutorial and this informal list on pluralization rules, I wrote this simple function:
def plural(word):
    """
    Converts a word to its plural form.
    """
    # `c` appears to be a separate constants module holding the word lists (not shown).
    if word in c.PLURALE_TANTUMS:
        # defective nouns, fish, deer, etc
        return word
    elif word in c.IRREGULAR_NOUNS:
        # foot->feet, person->people, etc
        return c.IRREGULAR_NOUNS[word]
    elif word.endswith('fe'):
        # knife -> knives
        return word[:-2] + 'ves'
    elif word.endswith('f'):
        # wolf -> wolves
        return word[:-1] + 'ves'
    elif word.endswith('o'):
        # potato -> potatoes
        return word + 'es'
    elif word.endswith('us'):
        # cactus -> cacti
        return word[:-2] + 'i'
    elif word.endswith('on'):
        # criterion -> criteria
        return word[:-2] + 'a'
    elif word.endswith('y'):
        # community -> communities
        return word[:-1] + 'ies'
    elif word[-1] in 'sx' or word[-2:] in ['sh', 'ch']:
        # box -> boxes, church -> churches
        return word + 'es'
    elif word.endswith('an'):
        # woman -> women
        return word[:-2] + 'en'
    else:
        return word + 's'
But I think this is incomplete. Is there a better way to do this?
The pattern-en package offers pluralization:
>>> import pattern.text.en
>>> pattern.text.en.pluralize("dog")
'dogs'
Note that for the import above to succeed, you may first have to run the following (at least the first time):
>>> import nltk
>>> nltk.download('omw-1.4')
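If adding a dependency is not an option, the rule-based idea from the question can be tightened somewhat. The following is only a rough sketch, not what pattern does internally: INVARIANT and IRREGULAR are small hypothetical stand-ins for the question's c.PLURALE_TANTUMS and c.IRREGULAR_NOUNS, and RULES is an illustrative ordered table where the first matching pattern wins. It shows one way to stop the 'y' rule from turning "day" into "daies" and the 'f' rule from mangling "cliff".

```python
import re

# Hypothetical lookup tables, for illustration only; real lists would be much longer.
INVARIANT = {"fish", "deer", "sheep", "series"}
IRREGULAR = {"foot": "feet", "person": "people", "man": "men", "child": "children"}

# Ordered (pattern, replacement) pairs; the first pattern that matches is applied.
RULES = [
    (re.compile(r"(?<=[^aeiou])y$"), "ies"),  # community -> communities, but day -> days
    (re.compile(r"(s|x|sh|ch)$"), r"\1es"),   # box -> boxes, church -> churches
    (re.compile(r"fe$"), "ves"),              # knife -> knives
    (re.compile(r"(?<=[^f])f$"), "ves"),      # wolf -> wolves, but cliff -> cliffs
]


def plural(word):
    """Return a best-guess plural of `word` using lookup tables, then suffix rules."""
    if word in INVARIANT:
        return word
    if word in IRREGULAR:
        return IRREGULAR[word]
    for pattern, replacement in RULES:
        result, count = pattern.subn(replacement, word)
        if count:
            return result
    # Default: append 's'.
    return word + "s"
```

Even with these guards, suffix rules still misfire on words such as "roof" or "photo", so anything beyond a toy use case is probably better served by a maintained library.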