I am trying to ingest tweets with the Twitter Streaming API.
Yesterday, after many tests, the Twitter API returned an error 420. I read some topics and documentation, and the problem is that I made too many connections in a short time.
from tweepy import Stream, API
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json

# All API keys / access token
consumer_key = "something"
consumer_secret_key = "something"
access_token = "something"
access_token_secret = "something"

proxies = {
    "http": "my_http_proxy",
    "https": "my_https_proxy"
}

class Listener(StreamListener):
    def on_status(self, status):
        print("text : " + str(status))

    def on_error(self, status):
        if status == 420:
            print("error : {}".format(str(status)))
            return False

auth = OAuthHandler(consumer_key, consumer_secret_key)
auth.set_access_token(access_token, access_token_secret)
api = API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
listener = Listener()
twitterStream = Stream(api.auth, listener=listener, proxies=proxies)
try:
    twitterStream.filter(track=['nasa'])
except Exception as e:
    print("...end : {}".format(e))
    twitterStream.disconnect()
twitterStream.disconnect()
I would like to understand:
Thanks a lot for your responses.
Twitter's API returns the 420 HTTP status code when an app is being rate limited for making too many requests.
See https://developer.twitter.com/en/docs/basics/response-codes.
Specifically, for streaming endpoints:
Back off exponentially for HTTP 420 errors. Start with a 1 minute wait and double each attempt. Note that every HTTP 420 received increases the time you must wait until rate limiting will no longer be in effect for your account.
Clients which do not implement backoff and attempt to reconnect as often as possible will have their connections rate limited for a small number of minutes. Rate limited clients will receive HTTP 420 responses for all connection requests.
Clients which break a connection and then reconnect frequently (to change query parameters, for example) run the risk of being rate limited.
Twitter does not make public the number of connection attempts which will cause a rate limiting to occur, but there is some tolerance for testing and development. A few dozen connection attempts from time to time will not trigger a limit. However, it is essential to stop further connection attempts for a few minutes if a HTTP 420 response is received. If your client is rate limited frequently, it is possible that your IP will be blocked from accessing Twitter for an indeterminate period of time.
See https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/connecting.
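The backoff rule quoted above (start at 1 minute, double on each HTTP 420) could be sketched roughly as follows. This is only an illustration, not Tweepy's or Twitter's own code: `RateLimited`, `backoff_delays`, and `run_with_backoff` are hypothetical names, and the 960 second cap and the limit of 5 attempts are assumptions chosen for the example.

```python
import time


class RateLimited(Exception):
    """Hypothetical signal that a connection attempt received HTTP 420."""


def backoff_delays(initial=60, factor=2, max_delay=960):
    """Yield waits in seconds: 60, 120, 240, ... capped at max_delay.

    The cap is an assumption for this sketch; the quoted guide only
    says to start at 1 minute and double on each attempt.
    """
    delay = initial
    while True:
        yield min(delay, max_delay)
        delay *= factor


def run_with_backoff(connect, max_attempts=5, sleep=time.sleep):
    """Call connect() until it returns without raising RateLimited.

    Returns the list of waits (in seconds) that were applied.
    Re-raises RateLimited if the last allowed attempt also fails.
    """
    waits = []
    delays = backoff_delays()
    for attempt in range(max_attempts):
        try:
            connect()
            return waits
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            wait = next(delays)
            waits.append(wait)
            sleep(wait)  # stop all connection attempts during the wait
    return waits
```

With the code from the question, `connect` would be a small wrapper that calls `twitterStream.filter(track=['nasa'])` and raises `RateLimited` when the listener's `on_error` saw a 420; that wiring is left out here because it depends on how the listener records the status.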