I am getting an error while trying to extract a .tar.gz
archive using the tarfile
library.
Here is the relevant code snippet:
# `gzip_archive_bytes_content` is the content of the gzip archive, in "bytes" format
repo_sources_file_object = io.BytesIO(gzip_archive_bytes_content)
repo_sources_tar_object = tarfile.TarFile(fileobj=repo_sources_file_object)
repo_sources_tar_object.extractall(path="/tmp/")
This the error I get:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/tarfile.py", line 186, in nti
s = nts(s, "ascii", "strict")
File "/usr/local/lib/python3.7/tarfile.py", line 170, in nts
return s.decode(encoding, errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x9a in position 1: ordinal not in range(128)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/tarfile.py", line 2289, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "/usr/local/lib/python3.7/tarfile.py", line 1095, in fromtarfile
obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
File "/usr/local/lib/python3.7/tarfile.py", line 1037, in frombuf
chksum = nti(buf[148:156])
File "/usr/local/lib/python3.7/tarfile.py", line 189, in nti
raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.7/site-packages/my-package/__main__.py", line 87, in <module>
function(**function_args)
File "/usr/local/lib/python3.7/site-packages/my-package/chart.py", line 107, in reinstall
install()
File "/usr/local/lib/python3.7/site-packages/my-package/chart.py", line 89, in install
repo_sources_tar_object = tarfile.TarFile(fileobj=repo_sources_file_object)
File "/usr/local/lib/python3.7/tarfile.py", line 1484, in __init__
self.firstmember = self.next()
File "/usr/local/lib/python3.7/tarfile.py", line 2301, in next
raise ReadError(str(e))
tarfile.ReadError: invalid header
Python version: 3.7
I switched from instanciating directly a tarfile.TarFile
object to using the tarfile.open()
constructor, and it fixed it:
repo_sources_tar_object = tarfile.open(fileobj=repo_sources_file_object)
There is actually a warning on this in the documentation, here:
Do not use this class directly: use tarfile.open() instead.