python nlp wordnet linguistics

Generating the plural form of a noun


Given a word, which may or may not be a singular-form noun, how would you generate its plural form?

Based on this NLTK tutorial and this informal list on pluralization rules, I wrote this simple function:

def plural(word):
    """
    Converts a word to its plural form.
    """
    # `c` is presumably a constants module holding the two lookup tables (not shown).
    if word in c.PLURALE_TANTUMS:
        # defective or invariant nouns: fish, deer, etc.
        return word
    elif word in c.IRREGULAR_NOUNS:
        # foot -> feet, person -> people, etc.
        return c.IRREGULAR_NOUNS[word]
    elif word.endswith('fe'):
        # knife -> knives
        return word[:-2] + 'ves'
    elif word.endswith('f'):
        # wolf -> wolves
        return word[:-1] + 'ves'
    elif word.endswith('o'):
        # potato -> potatoes
        return word + 'es'
    elif word.endswith('us'):
        # cactus -> cacti
        return word[:-2] + 'i'
    elif word.endswith('on'):
        # criterion -> criteria
        return word[:-2] + 'a'
    elif word.endswith('y'):
        # community -> communities
        return word[:-1] + 'ies'
    elif word.endswith(('s', 'x', 'sh', 'ch')):
        # box -> boxes, church -> churches
        # (endswith avoids an IndexError on an empty string)
        return word + 'es'
    elif word.endswith('an'):
        # woman -> women
        return word[:-2] + 'en'
    else:
        return word + 's'

But I think this is incomplete. Is there a better way to do this?


Solution

  • The pattern-en package offers pluralization:

    >>> import pattern.text.en
    >>> pattern.text.en.pluralize("dog")
    'dogs'
    

    Note that for the import above to succeed, you may first have to run the following (at least the first time):

    >>> import nltk
    >>> nltk.download('omw-1.4')
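
If adding a dependency is not an option, the rule-based function from the question can be tightened a little. The following is only a minimal sketch, not how pattern does it: the names `UNCHANGED`, `IRREGULAR`, and `RULES` are hypothetical, and the tiny tables stand in for real word lists. It keeps Latin plurals and the f/fe and o endings in the lookup table instead of applying them as suffix rules, so that "bus" no longer becomes "bi", and it only rewrites a final "y" after a consonant, so that "day" no longer becomes "daies".

```python
import re

# Hypothetical lookup tables; a real word list would be much longer.
UNCHANGED = {"fish", "deer", "sheep", "series", "species"}
IRREGULAR = {
    "foot": "feet",
    "person": "people",
    "man": "men",
    "woman": "women",
    "child": "children",
    "criterion": "criteria",
    "cactus": "cacti",
    "knife": "knives",
    "wolf": "wolves",
    "potato": "potatoes",
}

# Ordered (pattern, replacement) pairs; the first matching rule wins.
RULES = [
    (re.compile(r"([^aeiou])y$"), r"\1ies"),   # community -> communities
    (re.compile(r"(s|x|z|sh|ch)$"), r"\1es"),  # bus -> buses, church -> churches
]


def plural(word):
    """Return a best-guess plural of `word` (the result is lowercased)."""
    lower = word.lower()
    if not lower or lower in UNCHANGED:
        return lower
    if lower in IRREGULAR:
        return IRREGULAR[lower]
    for pattern, replacement in RULES:
        if pattern.search(lower):
            return pattern.sub(replacement, lower, count=1)
    # Default rule: dog -> dogs, day -> days
    return lower + "s"


for w in ["dog", "day", "community", "bus", "church", "foot", "fish"]:
    print(w, "->", plural(w))
```

Words missing from the tables still fall through to the generic rules, so this will mispluralize anything irregular that is not listed; a library with a maintained word list remains the more robust choice.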