python syntax nlp nltk lemmatization

Python syntax error in list comprehension on string for Lemmatization


I'm trying to lemmatize only the words in a string that have more than 4 letters. The desired output from the following code is 'us american', but I get an invalid syntax error instead.

    import nltk
    from nltk.tokenize import TweetTokenizer
    lemmatizer = nltk.stem.WordNetLemmatizer()
    w_tokenizer = TweetTokenizer()

    wd = w_tokenizer.tokenize(('us americans'))
    [lemmatizer.lemmatize(w) for w in wd if len(w)>4 else wd for wd in w]

Solution

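  • The if that follows the for clause in a list comprehension is only a filter and cannot take an else. To choose between two values for each word, put the conditional expression (x if condition else y) before the for: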

    [lemmatizer.lemmatize(w) if len(w)>4 else w for w in wd]
    

    Then, if you want a single string as in your desired output, you can join the resulting words with a space using str.join:

    ' '.join([lemmatizer.lemmatize(w) if len(w)>4 else w for w in wd])
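
To make the difference between the two placements of if concrete, here is a minimal sketch that runs without NLTK or its WordNet data. The function toy_lemmatize is a hypothetical stand-in for lemmatizer.lemmatize (it only strips a trailing "s"), and the word list is assumed to be what the tokenizer returns for the sample input.

```python
def toy_lemmatize(word):
    # Hypothetical stand-in for lemmatizer.lemmatize: only drops a trailing "s".
    return word[:-1] if word.endswith("s") else word

# Assumed tokenizer output for the sample input 'us americans'.
words = ["us", "americans"]

# Conditional expression before `for`: every word is kept,
# and only words longer than 4 letters are transformed.
mapped = [toy_lemmatize(w) if len(w) > 4 else w for w in words]

# Trailing `if` after `for`: acts as a filter, so short words are dropped.
filtered = [toy_lemmatize(w) for w in words if len(w) > 4]

print(mapped)            # ['us', 'american']
print(filtered)          # ['american']
print(" ".join(mapped))  # us american
```

With the real lemmatizer, swapping toy_lemmatize for lemmatizer.lemmatize should give the same shape of result, provided the WordNet corpus has been downloaded.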