I have a stream of links coming in, and I want to check them for RSS feeds every now and then. But when I fire off my get_rss() function, it blocks and the stream halts. This is unnecessary; I'd like to fire off get_rss() and forget about it (it stores its results elsewhere).
My code looks like this:
self.ff.get_rss(url) # not async
print 'im back!'
(...)
def get_rss(url):
    page = urllib2.urlopen(url) # not async
    soup = BeautifulSoup(page)
I'm thinking that if I can fire-and-forget the first call, then I can even use urllib2 without worrying about it not being async. Any help is much appreciated!
Edit: I'm trying out gevent, but with the code below nothing happens:
print 'go'
g = Greenlet.spawn(self.ff.do_url, url)
print g
print 'back'
# output:
go
<Greenlet at 0x7f760c0750f0: <bound method FeedFinder.do_url of <rss.FeedFinder object at 0x2415450>>(u'http://nyti.ms/SuVBCl')>
back
The Greenlet seems to be registered, but the function self.ff.do_url(url) doesn't seem to run at all. What am I doing wrong?
Use the threading module or the multiprocessing module, and save the result in a database, a file, or a queue.
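As a rough illustration of the threading route, here is a minimal sketch. The fire_and_forget helper, the results queue, and the stubbed body of get_rss are assumptions made up for this example; the real get_rss would do the blocking urllib2.urlopen and BeautifulSoup work from the question.

```python
import threading

try:
    import queue           # Python 3
except ImportError:
    import Queue as queue  # Python 2, which the question's code targets

# Hypothetical place where finished work is stored (could be a DB or file).
results = queue.Queue()


def get_rss(url):
    # Stand-in for the blocking function from the question; the real one
    # would fetch the page and parse it before storing its findings.
    results.put((url, "checked"))


def fire_and_forget(func, *args):
    # Start func(*args) in a background thread and return immediately.
    worker = threading.Thread(target=func, args=args)
    worker.daemon = True  # unfinished work will not keep the process alive
    worker.start()
    return worker


if __name__ == "__main__":
    worker = fire_and_forget(get_rss, "http://example.com/feed")
    print("im back!")     # reached without waiting for get_rss to finish
    worker.join()         # only for this demo, so the result is available
    print(results.get())
```

In the stream handler you would normally skip the join() and let the thread finish on its own. If links arrive quickly, a fixed pool of worker threads reading URLs from a queue is probably a safer design than one thread per link.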
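You can also use gevent. Note that a spawned greenlet does not run until the calling code yields control (for example with gevent.sleep(0) or g.join()), which is why nothing appears to happen in your snippet.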