I am building a data mining app that collects tweets using the Twitter streaming API (via tweepy) and runs a suite of NLP algorithms on them. So far, all I have been able to do is write the tweets to an external file. Because I will only collect about 100 tweets at a time (a pretty small volume), and because of deployment concerns, I would rather collect the tweets into a dictionary or list for further analysis. However, I have not managed to do this. The code I have so far is given below:
import tweepy

class MyStreamListener(tweepy.StreamListener):
    def __init__(self, api=None):
        super(MyStreamListener, self).__init__()
        self.num_tweets = 0
        self.tweets = []

    def on_status(self, status):
        #print(status.text)
        self.num_tweets += 1
        self.tweets.append(status.text)
        if self.num_tweets > 100:
            return False

def getstreams(keyword):
    CONSUMER_KEY = ''
    CONSUMER_SECRET = ''
    ACCESS_TOKEN = ''
    ACCESS_SECRET = ''
    auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
    api = tweepy.API(auth, wait_on_rate_limit=True)
    myStreamListener = MyStreamListener()
    myStream = tweepy.Stream(auth = api.auth,listener=myStreamListener)
    tweet_list = myStream.filter(track=[keyword])
    return tweet_list.tweets

getstreams('Starbucks')
However, when I run this, all I get is:
AttributeError: 'NoneType' object has no attribute 'tweets'
pointing to the line:
return tweet_list.tweets
I'd be grateful if anyone could explain how to overcome this issue and shed some light on how to collect n tweets into a list.
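The error occurs because myStream.filter() returns None, not the listener, so tweet_list has no tweets attribute. The collected tweets live on the listener object, so return myStreamListener.tweets instead. Alternatively, you can override the on_data method in your class and append each tweet to a list: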
import json  # needed for json.loads

# Inside MyStreamListener:
def on_data(self, data):
    # data is the raw JSON string sent by the stream; parse it into a dict
    tweet = json.loads(data)
    # my_tweet is our list declared globally
    my_tweet.append(tweet)
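To stop after n tweets with this approach, the counting can live in a small helper that on_data delegates to. Below is a minimal sketch of that idea, kept free of tweepy so it runs on its own. TweetCollector, limit, and add are hypothetical names of mine, not part of tweepy, and the messages in the simulated stream are made up for illustration. The check for a "text" key is there because the streaming endpoint also sends messages that are not tweets, such as delete and limit notices.

```python
import json


class TweetCollector:
    """Hypothetical helper: stores tweet texts until `limit` is reached."""

    def __init__(self, limit=100):
        self.limit = limit
        self.tweets = []

    def add(self, raw):
        """Parse one raw JSON message from the stream.

        Returns False once `limit` tweets are stored, True otherwise.
        """
        message = json.loads(raw)
        # Non-tweet messages (delete or limit notices, for example)
        # carry no "text" key, so they are skipped.
        if isinstance(message, dict) and "text" in message:
            self.tweets.append(message["text"])
        return len(self.tweets) < self.limit


# Simulated stream: a tweet, a limit notice, then two more tweets.
collector = TweetCollector(limit=2)
stream = [
    json.dumps({"text": "first"}),
    json.dumps({"limit": {"track": 5}}),
    json.dumps({"text": "second"}),
    json.dumps({"text": "third"}),
]
for raw in stream:
    if not collector.add(raw):
        break  # same effect as returning False from the listener

print(collector.tweets)
```

In the listener, on_data could then simply return self.collector.add(data), assuming the listener creates a collector in __init__. In the tweepy versions that provide StreamListener, returning False from on_data (or on_status, as in the question) is what ends the stream, and the list is then available as the collector's tweets attribute.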