python pandas dataframe stemming

How to apply stemming to a column in a pandas dataframe


Suppose I have the following dataframe:

import pandas as pd

d = {'col1': ['goodness', 'beautiful'], 'col2': [3, 4]}
df = pd.DataFrame(data=d)

Output
        col1  col2
0   goodness     3
1  beautiful     4

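I am using the Porter stemmer:

from nltk.stem import PorterStemmer  # assumption: porter is NLTK's PorterStemmer
porter = PorterStemmer()

print(porter.stem('goodness'))
print(porter.stem('beautiful'))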

Output
good
beauti

How can I apply this stem function to all elements of col1 from the original dataframe?

I tried the following, but it does not work because stem requires a word as input:

df['col1'].apply(porter.stem(word), arg= word for word in df['col1'])

The desired output is:

     col1  col2
0    good     3
1  beauti     4

Solution

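  • df['col1'] = df['col1'].apply(porter.stem)

    should do the job. Pass the function itself (porter.stem, without calling it); apply then calls it once for each element of the column, and the result is assigned back to col1.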
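
For reference, here is a minimal end-to-end sketch of the approach above. It assumes that porter is an instance of NLTK's PorterStemmer, which the question does not state explicitly; the dataframe is the one from the question.

```python
import pandas as pd
from nltk.stem import PorterStemmer  # assumption: the stemmer comes from NLTK

porter = PorterStemmer()

# Dataframe from the question
d = {'col1': ['goodness', 'beautiful'], 'col2': [3, 4]}
df = pd.DataFrame(data=d)

# Pass the bound method itself (no parentheses, no arguments);
# Series.apply calls it once per element and returns a new Series.
df['col1'] = df['col1'].apply(porter.stem)

print(df)
```

With the stemmer outputs shown in the question, col1 should end up holding good and beauti, while col2 is left unchanged.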