python zstandard

Using zstandard to compress a file in Python


I'm using the zstandard Python library, and I've written a helper class and a function that use context managers to decompress files.

import io
import zstandard as zstd

class ZstdReader:
    def __init__(self, filename):
        self.filename = filename

    def __enter__(self):
        self.f = open(self.filename, 'rb')
        dctx = zstd.ZstdDecompressor()
        reader = dctx.stream_reader(self.f)
        return io.TextIOWrapper(reader, encoding='utf-8')

    def __exit__(self, *a):
        self.f.close()
        return False

def openZstd(filename, mode='rb'):
    if 'w' in mode:
        return ZstdWriter(filename)
    return ZstdReader(filename)

This works really well and lets me write with openZstd('filename.zst', 'rb') as f: and then use the file f for JSON dumping and loading. However, I'm having trouble generalizing this to writing. I followed the documentation the same way as before, but something is not working. Here's what I've tried:

class ZstdWriter:
    def __init__(self, filename):
        self.filename = filename

    def __enter__(self):
        self.f = open(self.filename, 'wb')
        ctx = zstd.ZstdCompressor()
        writer = ctx.stream_writer(self.f)
        return io.TextIOWrapper(writer, encoding='utf-8')

    def __exit__(self, *a):
        self.f.close()
        return False

When I open a file using this class and call json.dump([], f), the file ends up empty. I guess one of the steps is swallowing my input, but I have no idea which one.


Solution

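  • As suggested by jasonharper in the comments, you have to flush both the io.TextIOWrapper and the zstd writer itself before closing the file. Closing only the underlying file, as __exit__ does in the question, leaves whatever those two layers are still buffering unwritten. For example: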

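    s = json.dumps({})
    iw = io.TextIOWrapper(writer, encoding="utf-8")
    iw.write(s)

    iw.flush()                      # push buffered text into the zstd writer
    writer.flush(zstd.FLUSH_FRAME)  # finish the zstd frame
    f.close()                       # f is the underlying file opened with 'wb'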

    This results in the data being written to the file and the file being complete.