I have a tarfile.TarFile from which I would like to extract some files to a modified destination filename; there is an existing file with the same name as the archive member that I do not want to touch. Specifically, I want to append a suffix, e.g. a member in the archive called foo/bar.txt should be extracted as foo/bar.txt.mysuffix.
The two somewhat obvious but also somewhat unsatisfactory approaches are:
- Call extractfile, create the renamed file, and copy the content using shutil.copyfileobj. However, this is either limited to regular files, or all the special handling implemented in tarfile (e.g. for sparse files, symlinks, directories) would have to be replicated.
- Call extractall into a temporary directory, then rename and copy to the destination. This feels unnecessarily convoluted, requires more interaction with the host system, introduces new failure modes, and seems easy to get subtly wrong (e.g. see the warnings on shutil.copy/copy2).

Is there no interface or hook on TarFile that would allow this to be implemented concisely and correctly?
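For reference, here is a minimal sketch of the first approach, restricted to regular files. The helper name extract_with_suffix and the demo function are hypothetical, made up for illustration; the sketch does not sanitize member paths and does not restore mode, ownership or timestamps, which is exactly the kind of handling that would have to be replicated by hand.

```python
import io
import os
import shutil
import tarfile
import tempfile


def extract_with_suffix(tar, member_name, dest_dir, suffix):
    """Copy one regular-file member to dest_dir under its name plus suffix.

    Hypothetical helper: no path sanitization, no metadata restoration.
    """
    member = tar.getmember(member_name)
    if not member.isreg():
        raise ValueError(f"{member_name!r} is not a regular file")
    target = os.path.join(dest_dir, member.name + suffix)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    # extractfile() returns a readable binary file object for regular members.
    with tar.extractfile(member) as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def demo_copy_approach():
    # Build a tiny archive in memory so the example is self-contained.
    data = b"from archive\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo("foo/bar.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)

    with tempfile.TemporaryDirectory() as dest:
        with tarfile.open(fileobj=buf, mode="r") as tar:
            target = extract_with_suffix(tar, "foo/bar.txt", dest, ".mysuffix")
        with open(target, "rb") as f:
            return os.path.relpath(target, dest), f.read()
```

Calling demo_copy_approach() extracts the single member under its suffixed name and returns the relative path together with the copied bytes.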
Looking through Lib/tarfile.py, I came across this comment:
    #--------------------------------------------------------------------------
    # Below are the different file methods. They are called via
    # _extract_member() when extract() is called. They can be replaced in a
    # subclass to implement other functionality.

    def makedir(self, tarinfo, targetpath):
        # ...

    def makefile(self, tarinfo, targetpath):
        # ...
These methods are not mentioned in the official reference documentation, but they appear to be fair game. To override them on an existing open TarFile instance, we can create a subclass that acts as a facade/wrapper:
    import tarfile

    class SuffixingTarFile(tarfile.TarFile):
        def __init__(self, suffix: str, wrapped: tarfile.TarFile):
            # Deliberately skip TarFile.__init__: the archive is already open.
            self.suffix = suffix
            self.wrapped = wrapped

        def __getattr__(self, attr):
            # Only called when normal lookup fails; delegate to the wrapped instance.
            return getattr(self.wrapped, attr)

        def makefile(self, tarinfo, targetpath):
            super().makefile(tarinfo, targetpath + self.suffix)

        # override makedir, makelink, makefifo, etc. as desired
Example:
    tar = tarfile.open(...)
    star = SuffixingTarFile(".foo", tar)
    star.extractall()  # extracts all (regular) file members with .foo suffix appended
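One caveat, as far as I can tell from reading Lib/tarfile.py: after makefile returns, _extract_member still calls chown, chmod and utime with the original, unsuffixed target path, so an existing file of that name could have its mode and timestamp changed. The following self-contained sketch is a variant that also redirects those three methods for regular files. It is opened through the inherited open() classmethod instead of wrapping an existing instance; the _suffixed helper, the suffix class attribute and the demo function are assumptions made for illustration, not part of tarfile.

```python
import io
import os
import tarfile
import tempfile


class SuffixingTarFile(tarfile.TarFile):
    suffix = ".mysuffix"  # illustrative default, override per instance if needed

    def _suffixed(self, tarinfo, targetpath):
        # Hypothetical helper: only regular files get the suffix.
        return targetpath + self.suffix if tarinfo.isreg() else targetpath

    def makefile(self, tarinfo, targetpath):
        super().makefile(tarinfo, self._suffixed(tarinfo, targetpath))

    # The attribute setters receive the unsuffixed path as well,
    # so point them at the file that was actually written.
    def chown(self, tarinfo, targetpath, *args, **kwargs):
        super().chown(tarinfo, self._suffixed(tarinfo, targetpath), *args, **kwargs)

    def chmod(self, tarinfo, targetpath):
        super().chmod(tarinfo, self._suffixed(tarinfo, targetpath))

    def utime(self, tarinfo, targetpath):
        super().utime(tarinfo, self._suffixed(tarinfo, targetpath))


def demo_subclass_approach():
    # Build a tiny archive in memory so the example is self-contained.
    data = b"from archive\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo("foo/bar.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)

    with tempfile.TemporaryDirectory() as dest:
        # An existing file with the member's name that must stay untouched.
        existing = os.path.join(dest, "foo", "bar.txt")
        os.makedirs(os.path.dirname(existing))
        with open(existing, "wb") as f:
            f.write(b"existing\n")

        # Use the "data" extraction filter where this Python version has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with SuffixingTarFile.open(fileobj=buf, mode="r") as star:
            star.extractall(dest, **kwargs)

        with open(existing, "rb") as f:
            untouched = f.read()
        with open(existing + SuffixingTarFile.suffix, "rb") as f:
            extracted = f.read()
    return untouched, extracted
```

Because these hooks are undocumented, their signatures may differ between Python versions; chown forwards extra arguments generically for that reason.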