I am trying to grab a single file from a tar archive. I am using the tarfile library, and I can do things like find the members with the right extension, following the example in the tarfile documentation:
    def xml_member_files(self, members):
        for tarinfo in members:
            if os.path.splitext(tarinfo.name)[1] == ".xml":
                yield tarinfo

    member_file = self.xml_member_files(tar)
    for m in member_file:
        print(m.name)
This is great and the output is:
RS2_C0RS2_OK67683_PK618800_DK549742_SLA23_20151006_234046_HH_SLC/lutBeta.xml
RS2_C0RS2_OK67683_PK618800_DK549742_SLA23_20151006_234046_HH_SLC/lutGamma.xml
RS2_C0RS2_OK67683_PK618800_DK549742_SLA23_20151006_234046_HH_SLC/lutSigma.xml
RS2_C0RS2_OK67683_PK618800_DK549742_SLA23_20151006_234046_HH_SLC/product.xml
If I look for just product.xml, it doesn't work. This is what I tried:
    ti = tar.getmember('product.xml')
    print(ti.name)
It doesn't find product.xml, I am guessing because of the path information in front of the name. I have no idea how to retrieve just that path so I can get at my product.xml file once it is extracted (it feels like I am doing things the hard way anyway). How do I figure out that path so I can join it with my other file functions to read and load the XML file, after it is the only file extracted from the tar file?
You can get the full path by iterating over the result of getnames(). For example, to get the full path for lutBeta.xml:
    import os
    import tarfile
    tar = tarfile.open('mytarfile.tar')  # open() also handles compressed archives
    membername = [x for x in tar.getnames() if os.path.basename(x) == 'lutBeta.xml'][0]
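Building on that, here is a minimal sketch of going from the bare file name to the file's contents without unpacking the rest of the archive. The helper names find_member and read_member are made up for illustration, and the sketch assumes the first member with a matching base name is the one you want.

```python
import os
import tarfile


def find_member(tar, filename):
    """Return the first regular-file TarInfo whose base name is filename, or None."""
    for member in tar.getmembers():
        if member.isfile() and os.path.basename(member.name) == filename:
            return member
    return None


def read_member(tar_path, filename):
    """Return (full member name, file contents as bytes) for one file in the archive."""
    with tarfile.open(tar_path) as tar:
        member = find_member(tar, filename)
        if member is None:
            raise KeyError("%s not found in %s" % (filename, tar_path))
        # extractfile() gives a file-like object, so nothing is written to disk
        return member.name, tar.extractfile(member).read()
```

With the archive from the question, read_member('mytarfile.tar', 'product.xml') should hand back the full member name (the long RS2_... directory plus product.xml) together with the XML bytes. If you would rather have the file on disk, tar.extract(member, path=some_dir) writes only that member, and os.path.join(some_dir, member.name) is where it ends up.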