I tried using the gzip module in Python 3.12 with the following code snippet, and I am running into errors.
Code snippet I tried:
with gzip.open('/tmp/' + object_name + '.gz', 'rb') as f:
    file_content = f.read()
gzip_decompressed_byte_output=gzip.decompress(file_content)
open('/tmp/' + object_name, "wb").write(gzip_decompressed_byte_output)
print("Directory contents after Docoding Gzip: ", os.listdir("/tmp/"))
Error:
    gzip_decompressed_byte_output=gzip.decompress(file_content)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.12/gzip.py", line 627, in decompress
    if _read_gzip_header(fp) is None:
       ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.12/gzip.py", line 456, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'<?')
Here is the input file format (I used the magic library to check it):
gzip compressed data, was "text.xml.gz", last modified: Wed Aug 21 01:26:59 2024, from Unix, original size modulo 2^32 16153477
I am able to decompress the same file manually using the Unix command gzip -d. Could you point out what I am doing wrong?
I tried using the Python gzip library.
It looks like you're trying to decompress twice. gzip.open already decompresses the data as you read it, so file_content holds the decompressed bytes. The second decompression fails, and the error shows you the first two bytes of file_content, <?, which is in fact the usual start of XML, <?xml ...
Simply save file_content instead of calling gzip.decompress on the already decompressed data.
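
A minimal sketch of the single-pass version, in case it helps. The helper name gunzip_file and the demo paths are made up for illustration and are not from the question; the sketch assumes the input is an ordinary gzip file like the one described. It streams with shutil.copyfileobj rather than calling f.read(), so the whole file does not need to sit in memory at once.

```python
import gzip
import os
import shutil
import tempfile


def gunzip_file(src_path, dest_path):
    """Decompress the gzip file at src_path into dest_path, exactly once."""
    # gzip.open() in 'rb' mode yields already decompressed bytes,
    # so they are copied straight to the output with no second decompress.
    with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest)


if __name__ == "__main__":
    # Hypothetical demo data standing in for the real text.xml.gz.
    workdir = tempfile.mkdtemp()
    gz_path = os.path.join(workdir, "text.xml.gz")
    xml_path = os.path.join(workdir, "text.xml")
    with gzip.open(gz_path, "wb") as f:
        f.write(b'<?xml version="1.0"?><root/>')

    gunzip_file(gz_path, xml_path)
    print("Directory contents after decoding gzip:", os.listdir(workdir))
```

The other way around should work too: read the raw bytes with the built-in open(path, 'rb') and call gzip.decompress on them once. The point is to use either gzip.open or gzip.decompress, not both on the same data.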