In Python, I'm trying to open a tar file, find certain files in it, read them, and then write a new file back into the same tarfile. It appears extractfile() is only allowed if the mode is "r". Is this the case? Is there a way to both extract files from a tar in memory and append new files to the same tar? Sample code below:
import re
import tarfile

def genEntry(tar, tarinfo, source):
    heading = re.compile(r'#+(\s+)?')
    f = tar.extractfile(tarinfo)
    f.seek(0)
    while True:
        line = f.readline().decode()
        print(line)
        if not line:
            break
        print(line)
        if heading.match(line):
            title = heading.sub('', line).replace('\n', '')
            return [tarinfo.name.replace(source, '.'), title]
    return [tarinfo.name.replace(source, '.'), tarinfo.name.replace(source, '')]
with tarfile.open(args.source, mode='a') as tar:
    source = 'somepath'
    subDir = 'someSubDir'
    path = '/'.join((source, subDir))
    if tar.getmember(path):
        pathre = re.compile(r'{}\/.+?\/readme\.md'.format(re.escape(path)), re.IGNORECASE)
        for tarinfo in tar.getmembers():
            if re.search(pathre, tarinfo.name):
                genEntry(tar, tarinfo, source)
...
This will generate the following error:
OSError: bad operation for mode 'a'
As far as I can tell, it is not possible to read from and append to a tarfile in one pass: extractfile() needs the archive opened for reading, while adding files needs it opened for writing or appending. I eventually switched to streaming the tarfile in and out of my Python script, but I did find a two-pass read/write solution to my question above.
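The mode restriction is easy to confirm in isolation. The sketch below builds a throwaway archive in a temporary directory (the helper names, file name, and contents are made up for illustration), then tries to read the same member with the archive opened in mode "r" and in mode "a". Listing members works in both modes; only extractfile() is refused in append mode.

```python
import io
import os
import tarfile
import tempfile


def add_bytes(tar, name, data):
    """Add an in-memory bytes payload to an open, writable tar archive."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    tar.addfile(info, fileobj=io.BytesIO(data))


def try_extract(tarpath, name, mode):
    """Open the archive in the given mode and try to read one member.

    Returns the member's bytes, or the OSError message if reading is refused.
    """
    with tarfile.open(tarpath, mode=mode) as tar:
        member = tar.getmember(name)  # looking up members works in both modes
        try:
            return tar.extractfile(member).read()
        except OSError as exc:
            return str(exc)


# Hypothetical demo archive; nothing here comes from the original question.
with tempfile.TemporaryDirectory() as workdir:
    tarpath = os.path.join(workdir, "demo.tar")
    with tarfile.open(tarpath, mode="w") as tar:
        add_bytes(tar, "docs/readme.md", b"# Title\n")

    read_result = try_extract(tarpath, "docs/readme.md", "r")
    append_result = try_extract(tarpath, "docs/readme.md", "a")

print(read_result)
print(append_result)
```

On CPython the second result should be the same "bad operation for mode" message shown in the question, which is why the read and the append have to happen in separate opens.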
Here's essentially the approach I landed on.
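from io import BytesIO
import tarfile

def readTar(tar):
    # Your logic goes here: return a list of (tarinfo, payload) pairs
    # describing the files you want to add to the amended archive.
    raise NotImplementedError

def writeFile(tar, tarinfo, payload):
    # Skip empty payloads; otherwise add the text as a read-only member.
    if payload:
        data = payload.encode('utf8')
        tarinfo.mode = 0o444
        tarinfo.size = len(data)
        tar.addfile(tarinfo, fileobj=BytesIO(data))
    return tar

# Pass 1: open read-only and collect what needs to be written.
files = []
with tarfile.open(tarpath) as tar:
    files = readTar(tar)

# Pass 2: reopen in append mode and add the new files.
with tarfile.open(tarpath, mode='a') as tar:
    for fileobj in files:
        writeFile(tar, fileobj[0], fileobj[1])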
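To make the two-pass idea concrete, here is a minimal, self-contained sketch that runs against a throwaway archive. The helper names (find_titles, append_text), the index.md output name, and the sample files are assumptions for illustration, not part of my original script. Pass one opens the archive for reading and collects the first Markdown heading of each readme.md; pass two reopens it with mode='a' and appends a generated index.

```python
import io
import os
import re
import tarfile
import tempfile

HEADING = re.compile(r"#+\s*")


def find_titles(tarpath):
    """Pass 1 (read-only): map each readme.md member name to its first heading."""
    titles = {}
    with tarfile.open(tarpath, mode="r") as tar:
        for info in tar.getmembers():
            if not info.isfile() or not info.name.lower().endswith("readme.md"):
                continue
            # The extracted file object yields the member's lines as bytes.
            for raw in tar.extractfile(info):
                line = raw.decode("utf8")
                if HEADING.match(line):
                    titles[info.name] = HEADING.sub("", line, count=1).strip()
                    break
    return titles


def append_text(tarpath, name, text):
    """Pass 2 (append-only): add a new read-only text member to the archive."""
    data = text.encode("utf8")
    info = tarfile.TarInfo(name=name)
    info.mode = 0o444
    info.size = len(data)
    with tarfile.open(tarpath, mode="a") as tar:
        tar.addfile(info, fileobj=io.BytesIO(data))


with tempfile.TemporaryDirectory() as workdir:
    tarpath = os.path.join(workdir, "demo.tar")

    # Build a sample archive with two readme files (made-up content).
    samples = [("a/readme.md", b"intro\n# Alpha\n"), ("b/README.md", b"## Beta\n")]
    with tarfile.open(tarpath, mode="w") as tar:
        for name, body in samples:
            info = tarfile.TarInfo(name=name)
            info.size = len(body)
            tar.addfile(info, fileobj=io.BytesIO(body))

    titles = find_titles(tarpath)
    index = "".join("{}: {}\n".format(n, t) for n, t in sorted(titles.items()))
    append_text(tarpath, "index.md", index)

    # Reopen read-only to confirm the appended member is there.
    with tarfile.open(tarpath, mode="r") as tar:
        names = tar.getnames()
        index_bytes = tar.extractfile("index.md").read()
```

One caveat: mode 'a' only works on uncompressed archives, so a .tar.gz would have to be rewritten into a new file instead of appended to.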