pipeline stemming

I need to perform a stemming operation in Python without nltk, using pipeline methods


I have a list of words and a list of stem rules.

I need to stem the words whose suffixes appear in the stem rules list. A friend hinted that I can use pipeline methods.

For example, if I have:

stem = [ 'less', 'ship', 'ing', 'les', 'ly', 'es', 's' ]
text = [ 'friends', 'friendly', 'keeping', 'friendship' ]

I should get:

'friend', 'friend', 'keep', 'friend'

Solution

  • rules = {'ness': '', 'ational': 'ate', 'ing': '', 'sses': 'ss'}

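    def stemx(inp: str):
        # Try each suffix rule in order; the first match wins.
        for x in rules:
            if inp.endswith(x):
                # Drop the suffix and append its replacement.
                return inp[:len(inp) - len(x)] + rules[x]
        # No rule matched, so return the word unchanged.
        return inp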

    print(stemx('singfds'))
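The same suffix-matching idea can be applied to the lists from the question. What follows is only a sketch under one assumption: "pipeline" is read here as passing each word through the suffix rules one after another, which is done with `functools.reduce`. The helper names `strip_suffix` and `stem_pipeline` are made up for illustration and are not from the question or any library.

```python
from functools import reduce

stem = ['less', 'ship', 'ing', 'les', 'ly', 'es', 's']
text = ['friends', 'friendly', 'keeping', 'friendship']


def strip_suffix(word: str, suffix: str) -> str:
    """One pipeline step: remove `suffix` from the end of `word` if present."""
    if suffix and word.endswith(suffix):
        return word[:-len(suffix)]
    return word


def stem_pipeline(word: str, suffixes: list[str]) -> str:
    """Feed the word through every suffix step, in list order."""
    return reduce(strip_suffix, suffixes, word)


stemmed = [stem_pipeline(word, stem) for word in text]
print(stemmed)  # ['friend', 'friend', 'keep', 'friend']
```

Because every rule is applied in turn, a word can lose more than one suffix if the remainder still ends in a later rule, so the order of `stem` matters. If only one suffix should ever be removed, return after the first match instead, as `stemx` does.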