python gzip

Decompress remote .gz file in Python


I have an issue with Python.

My case: I have a gzipped file from a partner platform (e.g. h..p//....namesite.../xxx). If I click the link in my browser, it downloads a file such as namefile.xml.gz.

If I read this downloaded file with Python, I can decompress and read it.

Code:

import gzip

# The file name must be passed as a string
content = gzip.open('namefile.xml.gz', 'rb')
print content.read()

But I can't do the same when I read the file from the remote source. From the remote file I can only read the encoded string; I can't decode it.

Code:

import urllib2

response = urllib2.urlopen(url)
encoded = response.read()  # still gzip-compressed at this point
print encoded

With this code I can read the encoded string, but I can't decode it with gzip or lzip.

Any advice? Thanks a lot.


Solution

  • Unfortunately, the method @Aya suggests does not work, since GzipFile makes extensive use of the seek method of the file object, which the response object does not support.

    So you have basically two options:

    1. Read the contents of the remote file into an io.BytesIO buffer (the data is binary, so io.StringIO will not accept it) and pass that object to gzip.GzipFile. This is suitable if the file is small.

    2. Download the file into a temporary file on disk and use gzip.open.
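
Both options can be sketched roughly as follows. This is only an illustration: the function names are made up for this example, and `response` is assumed to be any object with a `read()` method, such as the one returned by `urllib2.urlopen(url)` in the question.

```python
import gzip
import io
import os
import shutil
import tempfile


def decompress_in_memory(response):
    """Option 1: buffer the whole compressed payload, then decompress it."""
    # BytesIO is seekable, unlike the raw HTTP response object.
    buffer = io.BytesIO(response.read())
    with gzip.GzipFile(fileobj=buffer, mode='rb') as gz:
        return gz.read()


def decompress_via_tempfile(response):
    """Option 2: copy the payload to a temporary file, then use gzip.open."""
    # delete=False so the file can be reopened by name on every platform;
    # it is removed explicitly in the finally block.
    tmp = tempfile.NamedTemporaryFile(suffix='.gz', delete=False)
    try:
        with tmp:
            shutil.copyfileobj(response, tmp)
        with gzip.open(tmp.name, 'rb') as gz:
            return gz.read()
    finally:
        os.remove(tmp.name)
```

The first function holds both the compressed and the decompressed data in memory, so it only fits small files. The second keeps the compressed data on disk, but as written it still returns the whole decompressed content at once; for large files you would read from the gzip object in chunks instead.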

    There is another option (which requires some coding): implement your own reader using the zlib module. It is rather easy, but you will need to know about a magic constant, the wbits value 16 + zlib.MAX_WBITS, which tells zlib to expect a gzip header (see "How can I decompress a gzip stream with zlib?").
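
A minimal sketch of such a zlib-based reader is below. The name `iter_gunzip` and the `chunk_size` parameter are assumptions made for this example, and `response` is again assumed to be any object with a `read(n)` method. It only handles a single gzip member, which is the usual case for a downloaded .gz file.

```python
import zlib


def iter_gunzip(response, chunk_size=16 * 1024):
    """Yield decompressed chunks from a gzip stream without calling seek()."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
    # rather than a raw zlib stream.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        data = decompressor.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered once the input is exhausted.
    tail = decompressor.flush()
    if tail:
        yield tail
```

Because the data is processed chunk by chunk, neither the compressed nor the decompressed file has to fit in memory; the caller can write each yielded chunk to disk or feed it to a parser.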