
Python gzip decompression failing


I am using the gzip module in Python 3.12 with the following code snippet, and I am running into errors.

Code snippet I tried:

with gzip.open('/tmp/' + object_name + '.gz', 'rb') as f:
      file_content = f.read()
gzip_decompressed_byte_output=gzip.decompress(file_content)
open('/tmp/' + object_name, "wb").write(gzip_decompressed_byte_output) 
print("Directory contents after Decoding Gzip: ", os.listdir("/tmp/"))

Error:

    gzip_decompressed_byte_output=gzip.decompress(file_content)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.12/gzip.py", line 627, in decompress
    if _read_gzip_header(fp) is None:
       ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.12/gzip.py", line 456, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'<?')

Here is the input file format (used magic library to check the file format):


gzip compressed data, was "text.xml.gz", last modified: Wed Aug 21 01:26:59 2024, from Unix, original size modulo 2^32 16153477

I can decompress the same file manually with the Unix command gzip -d. Could you point out what I am doing wrong?



Solution

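  • It looks like you're decompressing twice. gzip.open() already decompresses the data as you read it, so file_content holds the decompressed bytes. The second step, gzip.decompress(file_content), then fails, and the error message shows the first two bytes of that decompressed data, <?, which are the usual first two bytes of an XML document (<?xml...).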

    Simply save file_content instead of trying gzip.decompress on the decompressed data.
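
    As a minimal sketch of that fix: the helper name gunzip_file, the temporary directory, and the sample XML bytes below are made up for illustration and are not from the question. It reads through gzip.open() once and writes the result straight to disk, streaming in chunks so a large file does not have to fit in memory.

```python
import gzip
import os
import shutil
import tempfile


def gunzip_file(src_path, dst_path):
    """Decompress the gzip file at src_path into dst_path (hypothetical helper)."""
    # gzip.open() decompresses transparently, so the bytes read from `src`
    # are already plain data and need no further gzip.decompress() call.
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Copy in chunks instead of loading the whole file into memory.
        shutil.copyfileobj(src, dst)


# Demo with a throwaway file (illustrative data, not the asker's file).
tmp_dir = tempfile.mkdtemp()
object_name = "text.xml"
gz_path = os.path.join(tmp_dir, object_name + ".gz")
out_path = os.path.join(tmp_dir, object_name)

sample = b'<?xml version="1.0"?><root/>'
with gzip.open(gz_path, "wb") as f:
    f.write(sample)

gunzip_file(gz_path, out_path)

with open(out_path, "rb") as f:
    result = f.read()

print("Directory contents after decompressing:", os.listdir(tmp_dir))
```

    If you would rather keep gzip.decompress(), read the raw compressed bytes with the built-in open(path, 'rb') instead of gzip.open(), so that decompression still happens only once.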