How to remove the `#HttpOnly_` prefix before MozillaCookieJar loads cookies.txt?


I use a Firefox extension (cookies.txt) to export a cookies.txt file for use in a Python script. Some HttpOnly cookies in that file begin with "#HttpOnly_", and MozillaCookieJar ignores those lines as if they were comments, e.g.:

#HttpOnly_.sample.com    TRUE    /    FALSE    1258200001    value1    111
#HttpOnly_.sample.com    FALSE   /    FALSE    1209905939    value2    222

There is an issue on the Python bug tracker, "MozillaCookieJar ignores HttpOnly cookies", with some code snippets that might fix the problem, like this:

from tempfile import NamedTemporaryFile
from http.cookiejar import MozillaCookieJar
from contextlib import contextmanager

def fix_cookie_jar_file(orig_cookiejarfile):
    with NamedTemporaryFile(mode='w+') as cjf:
        with open(orig_cookiejarfile, 'r') as ocf:
            for l in ocf:
                cjf.write(l[10:] if l.startswith('#HttpOnly_') else l)
        cjf.seek(0)
        yield cjf.name

# the following line raises TypeError: expected str, bytes or os.PathLike object, not generator
# Ref: https://bugs.python.org/issue2190
MozillaCookieJar(filename=fix_cookie_jar_file('d:/cookies.txt'))

How can I fix the TypeError?

Thanks.

Update: the following code works as expected.

from tempfile import NamedTemporaryFile
from http.cookiejar import MozillaCookieJar
from contextlib import contextmanager
import os

# Ref: https://bugs.python.org/issue2190
def fix_cookie_jar_file(orig_cookiejarfile):
    with NamedTemporaryFile(mode='w+', delete=False) as cjf:
        with open(orig_cookiejarfile, 'r') as ocf:
            for l in ocf:
                cjf.write(l[10:] if l.startswith('#HttpOnly_') else l)
        cjf.seek(0)
        return cjf.name

filename = fix_cookie_jar_file(r"d:\cookies.txt")
jar = MozillaCookieJar(filename)
jar.load(ignore_expires = True)

# delete "manually" afterwards
os.remove(filename)

i=0
for ck in jar:
    print("(%d) %s : %s"%(i,ck.name,ck.value))
    i+=1

The open() function can't accept a generator as a parameter. Is there a decorator or wrapper for open() that can accept a generator?


Solution

  • You can solve the issue by modifying the following line:

    MozillaCookieJar(filename=fix_cookie_jar_file('d:/cookies.txt'))
    

    To this:

    MozillaCookieJar(filename=next(fix_cookie_jar_file('d:/cookies.txt')))
    

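    Here, next() advances the generator to its first yield and returns the yielded value, which is the name of the temporary file. MozillaCookieJar then receives a plain string instead of the generator object, so the TypeError goes away. Note that this only gets past the TypeError: the generator stays suspended inside the `with` block, and the temporary file is deleted as soon as the generator is closed or garbage-collected, so the file may no longer exist by the time load() runs.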

    Hope it helps!

    UPDATE:

    I see that you updated the question. I refactored your code to fix the issue; here is a working version:

    from http.cookiejar import MozillaCookieJar
    import os
    
    temp_filename = "temp_file.txt"
    
    # Ref: https://bugs.python.org/issue2190
    def remove_httponly(cookiejar_file):
      cookies = []
      with open(cookiejar_file, 'r') as cookie_file:
        for line in cookie_file:
          cookies.append(line[10:] if line.startswith('#HttpOnly_') else line)
      with open(temp_filename, 'w') as temp_file:
        temp_file.write(''.join(cookies))
        return True
    
    if remove_httponly("d:/cookies.txt"):
      new_cookie_jar = MozillaCookieJar(temp_filename)
      new_cookie_jar.load(ignore_expires = True)
    
      # delete "manually" afterwards
      os.remove(temp_filename)
    
      # enumerate() avoids a manual counter and shadowing the built-in id()
      for index, cookie in enumerate(new_cookie_jar):
        print(f"({index}) {cookie.name} : {cookie.value}")
    

    Note: Don't forget to verify that your cookies.txt file is in the format MozillaCookieJar expects. Otherwise, load() will fail with a separate error (http.cookiejar.LoadError).
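
The question imports contextmanager but never applies it, and asks for a decorator or wrapper that would let the generator be used where a filename is expected. One possible answer is sketched below: wrapping the generator with contextlib.contextmanager so the temporary file exists for exactly the duration of a `with` block. The names fixed_cookie_file, load_cookies and HTTPONLY_PREFIX are made up for this illustration and do not come from the question or the bug tracker. The sketch assumes delete=False is needed so the file can be reopened by name (as on Windows), and removes the file itself afterwards.

```python
import os
from contextlib import contextmanager
from http.cookiejar import MozillaCookieJar
from tempfile import NamedTemporaryFile

# Hypothetical constant name; the prefix itself is the one from the question.
HTTPONLY_PREFIX = "#HttpOnly_"


@contextmanager
def fixed_cookie_file(path):
    """Yield the name of a temporary copy of `path` with '#HttpOnly_' stripped."""
    # delete=False: the file is closed when this block ends but stays on disk,
    # so it can be reopened by name afterwards.
    with NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as tmp:
        with open(path, "r") as src:
            for line in src:
                if line.startswith(HTTPONLY_PREFIX):
                    line = line[len(HTTPONLY_PREFIX):]
                tmp.write(line)
    try:
        # The caller's `with` body runs here, while the file still exists.
        yield tmp.name
    finally:
        # Clean up even if loading the cookies raised an exception.
        os.remove(tmp.name)


def load_cookies(path):
    """Load `path` into a MozillaCookieJar, including HttpOnly entries."""
    jar = MozillaCookieJar()
    with fixed_cookie_file(path) as name:
        jar.load(name, ignore_expires=True)
    return jar
```

With this in place, `jar = load_cookies(r"d:\cookies.txt")` replaces the manual temp-file handling and the os.remove() call. The `with` statement receives the filename as a string, so nothing ever passes a generator to open().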