python-3.x tarfile

Python: readlines() with tar-file gives StreamError: seeking backwards is not allowed. What is wrong?


I have seen similar questions, but none exactly like this one, and I can't figure out what is wrong with this Python code:

import tarfile

tar_file = tarfile.open('something.tgz', mode="r|gz")
txt_file = tar_file.extractfile('inner.txt')
lines = txt_file.readlines()    
txt_file.close()
tar_file.close()

It raises StreamError: seeking backwards is not allowed at the readlines() call.
This seems strange to me, and I am trying to understand what I am missing here.


Solution

  • The problem is with this line:

    tar_file = tarfile.open('something.tgz', mode="r|gz")
    

    According to the tarfile.open() docs, the mode should be either "r" (open for reading with transparent compression, recommended) or "r:gz" (open for reading with gzip compression). Using the pipe character | instead of a colon opens the archive as a stream, which the docs describe like this:

    Use this variant in combination with e.g. sys.stdin, a socket file object or a tape device. However, such a TarFile object is limited in that it does not allow random access

    That restriction is what you ran into: reading the member with readlines() needs to seek() backwards in the archive, and a stream-mode TarFile cannot do that. When I changed the pipe | to a colon :, your code worked fine.
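
    For reference, here is a minimal sketch of both approaches. The helper names (make_sample_archive, read_lines_random_access, read_lines_streaming) and the sample file contents are my own assumptions for illustration, not part of the question. The first reader uses "r:gz", which allows random access; the second keeps "r|gz" but reads each member as the iteration reaches it, so nothing has to seek backwards.

```python
import io
import os
import tarfile
import tempfile


def make_sample_archive(path):
    """Create a small .tgz containing inner.txt (made-up sample data)."""
    data = b"first line\nsecond line\n"
    info = tarfile.TarInfo(name="inner.txt")
    info.size = len(data)
    with tarfile.open(path, mode="w:gz") as tar:
        tar.addfile(info, io.BytesIO(data))


def read_lines_random_access(path, member_name):
    """Open with "r:gz" (colon): members can be looked up by name."""
    with tarfile.open(path, mode="r:gz") as tar:
        with tar.extractfile(member_name) as f:
            return f.readlines()


def read_lines_streaming(path, member_name):
    """Open with "r|gz" (pipe): read each member while iterating forward."""
    with tarfile.open(path, mode="r|gz") as tar:
        for member in tar:
            if member.name == member_name:
                # The stream is positioned at this member's data right now,
                # so reading it does not require seeking backwards.
                with tar.extractfile(member) as f:
                    return f.readlines()
    return None  # member not found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "something.tgz")
        make_sample_archive(archive)
        print(read_lines_random_access(archive, "inner.txt"))
        print(read_lines_streaming(archive, "inner.txt"))
```

    Note that readlines() returns bytes here, since extractfile() gives you a binary file object. Unless you really are reading from a pipe or socket, the "r:gz" (or plain "r") variant is the simpler choice.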