I'm trying to build a datastore on Google App Engine to collect some stream data off of StockTwits for a bunch of companies. I'm basically replicating one I did with Twitter, but it's giving me an HTTPException: Invalid and/or missing SSL certificate error for one of the URLs. I changed the URL to look at another company, but got the same result.
Here's the code that pulls the data:
class StreamHandler(webapp2.RequestHandler):
    def get(self):
        tickers = ['AAPL','GOOG', 'IBM', 'BAC', 'INTC',
                   'DELL', 'C', 'JPM', 'WFM', 'WMT',
                   'AMZN', 'HOT', 'SPG', 'SWY', 'HTSI',
                   'DUK', 'CEG', 'XOM', 'F', 'WFC',
                   'CSCO', 'UAL', 'LUV', 'DAL', 'COST', 'YUM',
                   'TLT', 'HYG', 'JNK', 'LQD', 'MSFT',
                   'GE', 'LVS', 'MGM', 'TWX', 'DIS', 'CMCSA',
                   'TWC', 'ORCL', 'WPO', 'NYT', 'GM', 'JCP',
                   'LNKD', 'OPEN', 'NFLX', 'SBUX', 'GMCR',
                   'SPLS', 'BBY', 'BBBY', 'YHOO', 'MAR',
                   'L', 'LOW', 'HD', 'HOV', 'TOL', 'NVR', 'RYL',
                   'GIS', 'K', 'POST', 'KRFT', 'CHK', 'GGP',
                   'RSE', 'RWT', 'AIG', 'CB', 'BRK.A', 'CAT']
        for i in set(tickers):
            urlst = 'https://api.stocktwits.com/api/2/streams/symbol/'
            tickerstringst = urlst + i + '.json'
            tickurlst = urllib2.Request(tickerstringst)
            sttweets = urllib2.urlopen(tickurlst)
            stcode = sttweets.getcode()
            if stcode == 200:
                stresults = json.load(sttweets, 'utf-8')
                if "messages" in stresults:
                    stentries = stresults["messages"]
                    for stentry in stentries:
                        sttweet = streamdata()
                        stcreated = stentry['created_at']
                        sttweetid = str(stentry['id'])
                        sttweettxt = stentry['body']
                        sttweet.ticker = i
                        sttweet.created_at = stcreated
                        sttweet.tweet_id = sttweetid
                        sttweet.text = sttweettxt
                        sttweet.source = "StockTwits"
                        sttweet.put()
And here's the log file that shows the error. I'm running this on the local Python development server, btw:
WARNING 2012-12-06 23:20:12,993 dev_appserver.py:3655] Could not initialize images API; you are likely missing the Python "PIL" module. ImportError: No module named _imaging
INFO 2012-12-06 23:20:13,017 dev_appserver_multiprocess.py:655] Running application dev~jibdantestv2 on port 8088: http://localhost:8088
INFO 2012-12-06 23:20:13,017 dev_appserver_multiprocess.py:657] Admin console is available at: http://localhost:8088/_ah/admin
INFO 2012-12-06 23:20:54,776 dev_appserver.py:3092] "GET /_ah/admin HTTP/1.1" 302 -
INFO 2012-12-06 23:20:54,953 dev_appserver.py:3092] "GET /_ah/admin/datastore HTTP/1.1" 200 -
INFO 2012-12-06 23:20:55,280 dev_appserver.py:3092] "GET /_ah/admin/images/google.gif HTTP/1.1" 200 -
INFO 2012-12-06 23:21:04,617 dev_appserver.py:3092] "GET /_ah/admin/cron HTTP/1.1" 200 -
INFO 2012-12-06 23:21:04,815 dev_appserver.py:3092] "GET /_ah/admin/images/google.gif HTTP/1.1" 200 -
WARNING 2012-12-06 23:21:07,392 urlfetch_stub.py:448] Stripped prohibited headers from URLFetch request: ['Host']
ERROR 2012-12-06 23:21:09,921 webapp2.py:1553] Invalid and/or missing SSL certificate for URL: https://api.stocktwits.com/api/2/streams/symbol/GIS.json
Traceback (most recent call last):
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1536, in __call__
rv = self.handle_exception(request, response, e)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1530, in __call__
rv = self.router.dispatch(request, response)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1278, in default_dispatcher
return route.handler_adapter(request, response)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 1102, in __call__
return handler.dispatch()
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 572, in dispatch
return self.handle_exception(e, self.app.debug)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2\webapp2.py", line 570, in dispatch
return method(*args, **kwargs)
File "C:\Users\Tank\Documents\Aptana Studio 3 Workspace\jibdantestv2\main.py", line 38, in get
sttweets = urllib2.urlopen(tickurlst)
File "C:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 400, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 418, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1215, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "C:\Python27\lib\urllib2.py", line 1180, in do_open
r = h.getresponse(buffering=True)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\dist27\httplib.py", line 502, in getresponse
raise HTTPException(str(e))
HTTPException: Invalid and/or missing SSL certificate for URL: https://api.stocktwits.com/api/2/streams/symbol/GIS.json
INFO 2012-12-06 23:21:09,937 dev_appserver.py:3092] "GET /add_data HTTP/1.1" 500 -
I don't know why GAE has a problem with it, but I notice that the certificate returned by api.stocktwits.com doesn't match the server name in its Subject's Common Name (which is ssl2361.cloudflare.com); it matches only one of its Subject Alternative Names ("DNS Name=*.stocktwits.com"). Maybe Subject Alternative Names are not supported, or don't work with wildcard names as used here. (This would be a Google bug / missing feature.)
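To make the mismatch concrete, here is a minimal sketch of how a hostname check behaves when it looks only at the Common Name versus when it also considers wildcard Subject Alternative Names. The function name hostname_matches is hypothetical and the wildcard rule is deliberately simplified (a leading "*." covers exactly one label); it is not GAE's actual validation code, only an illustration of the theory above.

```python
def hostname_matches(hostname, pattern):
    """Return True if hostname matches a certificate name pattern.

    Simplified rule (an assumption for this sketch): a leading '*.'
    wildcard stands for exactly one left-most label; anything else
    must match exactly, ignoring case.
    """
    hostname = hostname.lower().rstrip(".")
    pattern = pattern.lower().rstrip(".")
    if pattern.startswith("*."):
        # Drop the first label of the hostname and compare the remainder
        # with the part of the pattern after '*.'.
        first_label, sep, rest = hostname.partition(".")
        return bool(first_label) and bool(sep) and rest == pattern[2:]
    return hostname == pattern


# Names taken from the certificate described above.
common_name = "ssl2361.cloudflare.com"
san_dns_names = ["*.stocktwits.com"]
host = "api.stocktwits.com"

# A validator that only looks at the Common Name rejects the host.
print(hostname_matches(host, common_name))  # False

# A validator that also walks the Subject Alternative Names accepts it.
print(any(hostname_matches(host, name) for name in san_dns_names))  # True
```

If the validator behind urlfetch behaved like the first check, it would produce exactly the kind of rejection seen in the log.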
I was able to reproduce your problem and found a workaround: call the GAE urlfetch.fetch API directly. (As you may know, on GAE urllib2 is implemented as a wrapper for urlfetch.)
Replace everything from the line with your urllib2.Request up to and including your json.load with:
# urlfetch must be imported at the top of main.py:
# from google.appengine.api import urlfetch
sttweets = urlfetch.fetch(tickerstringst, validate_certificate=False)
stcode = sttweets.status_code
if stcode == 200:
    stresults = json.loads(sttweets.content, 'utf-8')
With that change the error goes away, along with any assurance that you are actually talking to the real site (though the traffic should still be encrypted).
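If it helps, the workaround can be pulled into a small helper so the handler loop stays readable. This is only a sketch under assumptions: fetch_messages and STREAM_URL are hypothetical names, and the fetch function is passed in as a parameter so the helper can be exercised without the App Engine SDK. On GAE you would pass urlfetch.fetch (from google.appengine.api).

```python
import json

# URL pattern taken from the question's code.
STREAM_URL = 'https://api.stocktwits.com/api/2/streams/symbol/%s.json'


def fetch_messages(ticker, fetch):
    """Return the list of message dicts for one ticker, or [] on failure.

    `fetch` is any callable shaped like urlfetch.fetch: it takes a URL plus
    keyword arguments and returns an object with `status_code` and
    `content` attributes.
    """
    # validate_certificate=False is the workaround described above; it
    # disables the certificate check that fails for this host.
    response = fetch(STREAM_URL % ticker, validate_certificate=False)
    if response.status_code != 200:
        return []
    results = json.loads(response.content)
    return results.get('messages', [])
```

Inside the handler, the loop body would then become something like "for stentry in fetch_messages(i, urlfetch.fetch):" followed by the same streamdata() assignments as before.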
Currently the urlfetch.fetch GAE API docs say:
validate_certificate
The underlying implementation currently defaults to False, but will default to True in the near future.
Well, welcome to the future, because validate_certificate now seems to be defaulting to True.
This is probably a bug (or a missing feature, if you want to be kind) in GAE's urlfetch.fetch, and I encourage you to report it to Google as such.