Thanks for looking into this, I have a python program for which I need to have process_tweet
and build_freqs
for some NLP task, nltk
is installed already and utils
wasn't so I installed it via pip install utils
but the above mentioned two modules apparently weren't installed, the error I got is standard one here,
ImportError: cannot import name 'process_tweet' from
'utils' (C:\Python\lib\site-packages\utils\__init__.py)
what have I done wrong or is there anything missing? Also I referred This stackoverflow answer but it didn't help.
You can easily access any source code with ??, for example in this case: process_tweet?? (the code above from deeplearning.ai NLP course custome utils library):
def process_tweet(tweet):
"""Process tweet function.
Input:
tweet: a string containing a tweet
Output:
tweets_clean: a list of words containing the processed tweet
"""
stemmer = PorterStemmer()
stopwords_english = stopwords.words('english')
# remove stock market tickers like $GE
tweet = re.sub(r'\$\w*', '', tweet)
# remove old style retweet text "RT"
tweet = re.sub(r'^RT[\s]+', '', tweet)
# remove hyperlinks
tweet = re.sub(r'https?:\/\/.*[\r\n]*', '', tweet)
# remove hashtags
# only removing the hash # sign from the word
tweet = re.sub(r'#', '', tweet)
# tokenize tweets
tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True,
reduce_len=True)
tweet_tokens = tokenizer.tokenize(tweet)
tweets_clean = []
for word in tweet_tokens:
if (word not in stopwords_english and # remove stopwords
word not in string.punctuation): # remove punctuation
# tweets_clean.append(word)
stem_word = stemmer.stem(word) # stemming word
tweets_clean.append(stem_word)