I'm using the cookielib module to handle HTTP cookies when using the urllib2 module in Python 2.6 in a way similar to this snippet:
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
I'd like to store the cookies in a database. I don't know which is better: serializing the CookieJar object and storing it, or extracting the cookies from the CookieJar and storing those. I also don't know how to implement either approach. I should also be able to recreate the CookieJar object.
Could someone help me out with the above?
Thanks in advance.
cookielib.Cookie, to quote its docstring (in its sources), "is deliberately a very simple class. It just holds attributes.", so pickle (or another serialization approach) is just fine for saving and restoring each Cookie instance.
As for CookieJar, set_cookie sets/adds one cookie instance, and __iter__ (to use it, just do a for loop on the jar instance) returns all the cookie instances it holds, one after the other.
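To make that concrete, here is a minimal sketch of the "extract the cookies and store them" route: it pickles the cookies you get by iterating the jar, and rebuilds a jar with set_cookie. The function names dump_cookies and load_cookies are made up for this example, and the import fallback is only there so the same code also runs on Python 3, where cookielib was renamed http.cookiejar.

```python
import pickle

try:                        # Python 2, as in the question
    import cookielib
except ImportError:         # Python 3 renamed the module
    import http.cookiejar as cookielib


def dump_cookies(jar):
    """Return one byte string holding every Cookie in the jar (hypothetical helper)."""
    # Iterating a CookieJar yields its Cookie instances one by one.
    return pickle.dumps(list(jar), pickle.HIGHEST_PROTOCOL)


def load_cookies(blob, jar=None):
    """Rebuild a jar from a byte string produced by dump_cookies (hypothetical helper)."""
    if jar is None:
        jar = cookielib.CookieJar()
    for cookie in pickle.loads(blob):
        jar.set_cookie(cookie)  # adds (or replaces) one cookie
    return jar
```

The rebuilt jar can be handed to urllib2.HTTPCookieProcessor exactly like the original one. Note that this saves only the cookies, not the jar's policy.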
A subclass that you can use to see how to make a "cookie jar on a database" is BSDDBCookieJar (part of mechanize, but I just pointed specifically to the jar source code file). It doesn't load all cookies in memory, but rather keeps them in a self._db, which is a bsddb instance (a mostly-on-disk, dict-lookalike hash table constrained to having only strings as keys and values), and uses pickle for serialization.
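The same idea, one pickled cookie per entry in a string-keyed store, can be sketched without mechanize. This is not BSDDBCookieJar's actual layout: the key format and the names cookie_key, put_cookie and jar_from_store are assumptions for illustration, and db stands for any dict-like object (a plain dict works for trying it out).

```python
import pickle

try:                        # Python 2, as in the question
    import cookielib
except ImportError:         # Python 3 renamed the module
    import http.cookiejar as cookielib


def cookie_key(cookie):
    # Hypothetical key layout: one entry per (domain, path, name).
    return "|".join([cookie.domain, cookie.path, cookie.name])


def put_cookie(db, cookie):
    """Store one pickled cookie in a dict-like store, replacing any older entry."""
    db[cookie_key(cookie)] = pickle.dumps(cookie, pickle.HIGHEST_PROTOCOL)


def jar_from_store(db):
    """Build an in-memory CookieJar from every entry in the store."""
    jar = cookielib.CookieJar()
    for key in db.keys():
        jar.set_cookie(pickle.loads(db[key]))
    return jar
```

Keying on domain, path and name means a refreshed cookie overwrites its old entry instead of piling up next to it.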
If you are OK with keeping every cookie in memory during operations, simply pickling the jar is simplest (and, of course, put the blob in the DB and get it back from there when you're restarting): s = cPickle.dumps(myJar, -1) gives you a big byte string for the whole jar (and policy thereof, of course, not just the cookies), and theJar = cPickle.loads(s) rebuilds it once you've reloaded s as a blob from the DB. One caveat: a plain CookieJar also holds a thread lock, which pickle refuses to serialize, so if dumps raises a TypeError, pickle list(myJar) instead and rebuild the jar with set_cookie.
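For the database side, here is a rough sketch using sqlite3 as a stand-in for whatever DB you actually use. The table name cookie_jars, its columns, and the helpers save_jar and load_jar are all invented for this example. It pickles list(jar) rather than the jar object, because the jar's internal lock can make pickling the jar itself fail, so the policy is not saved.

```python
import pickle
import sqlite3

try:                        # Python 2, as in the question
    import cookielib
except ImportError:         # Python 3 renamed the module
    import http.cookiejar as cookielib


def _ensure_table(conn):
    # Hypothetical schema: one row per saved jar, cookies stored as one blob.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cookie_jars (id TEXT PRIMARY KEY, data BLOB)"
    )


def save_jar(conn, jar_id, jar):
    """Pickle the jar's cookies and store them under jar_id."""
    _ensure_table(conn)
    blob = pickle.dumps(list(jar), pickle.HIGHEST_PROTOCOL)
    conn.execute(
        "INSERT OR REPLACE INTO cookie_jars (id, data) VALUES (?, ?)",
        (jar_id, sqlite3.Binary(blob)),
    )
    conn.commit()


def load_jar(conn, jar_id):
    """Return a CookieJar with the cookies saved under jar_id (empty if none)."""
    _ensure_table(conn)
    row = conn.execute(
        "SELECT data FROM cookie_jars WHERE id = ?", (jar_id,)
    ).fetchone()
    jar = cookielib.CookieJar()
    if row is not None:
        for cookie in pickle.loads(bytes(row[0])):
            jar.set_cookie(cookie)
    return jar
```

On restart you would call load_jar, pass the result to urllib2.HTTPCookieProcessor, and call save_jar again once you are done with the session.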