python zstandard

Decompression does not work for own file


I'm relatively new to the Python programming language and I ran into a problem with the zstandard module. I'm currently working with the replay files of Halite. Since they are compressed with zstandard, I have to use this module. Reading a file works fine: I can decompress the ".hlt" files.

But I've done some transformations of the JSON data that I want to save to disk for later use. I find it very useful to store the data compressed again, so I used the compressor. The compression works fine, too. However, when I open the file I just created, I get an error message reading: "zstd.ZstdError: decompression error: Unknown frame descriptor".

Have a look at my code below:

def getFileData(self, filename):
    with open(filename, "rb") as file:
        data = file.read()
    return data

def saveDataToFile(self, filename, data):
    with open(filename, "bw") as file:
        file.write(data)

def transformCompressedToJson(self, data, beautify=0):
    zd = ZstdDecompressor()
    decompressed = zd.decompress(data, len(data))
    return json.loads(decompressed)

def transformJsonToCompressed(self, jsonData, beautify=0):
    zc = ZstdCompressor()
    if beautify > 0:
        jsonData = json.dumps(jsonData, sort_keys=True, indent=beautify)
    objectCompressor = zc.compressobj()
    compressed = objectCompressor.compress(jsonData.encode())
    return objectCompressor.flush()

And i am using it here:

rp = ReplayParser()

gameDict = rp.parse('replays/replay-20180215-152416+0100--4209273584-160-160-278627.hlt')

compressed = rp.transformJsonToCompressed(json.dumps(gameDict, sort_keys=False, indent=0))

rp.saveDataToFile("test.cmp", compressed)

t = rp.getFileData('test.cmp')
j = rp.transformCompressedToJson(t)  # <- here is the error
print(j)

The function rp.parse(..) just transforms the data, so it just creates a dictionary. It also calls transformCompressedToJson, so decompression works fine for the .hlt file.

Hopefully, you guys can help me with this.

Greetings,

Noixes


Solution

  • In transformJsonToCompressed(), you are throwing away the result of the .compress() method (which is likely going to be the bulk of the output data), and instead returning only the result of .flush() (which will just be the last little bit of data remaining in buffers). The normal way to use a compression library like this would be to write each chunk of compressed data directly to the output file as it is generated. Your code isn't structured to allow that (the function knows nothing about the file the data will be written to), so instead you could concatenate the two chunks of compressed data and return that.
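The concatenation approach could look roughly like the sketch below. To keep it runnable with the standard library alone, it uses zlib.compressobj() as a stand-in; the object returned by ZstdCompressor().compressobj() exposes the same compress()/flush() pair, so the helper should work unchanged when given one. The names compress_all and json_to_compressed and the sample dictionary are made up for illustration and are not part of your code or of either library.

```python
import json
import zlib  # standard-library stand-in for zstandard in this sketch


def compress_all(compress_obj, payload):
    """Return every byte the compression object emits for payload."""
    # compress() may return most, some, or none of the compressed output.
    head = compress_obj.compress(payload)
    # flush() returns whatever is still buffered and ends the stream.
    tail = compress_obj.flush()
    # Both parts are needed; returning only the tail loses the frame start.
    return head + tail


def json_to_compressed(obj, compress_obj, beautify=0):
    """Serialize obj to JSON text and compress it (hypothetical helper)."""
    indent = beautify if beautify > 0 else None
    text = json.dumps(obj, sort_keys=True, indent=indent)
    return compress_all(compress_obj, text.encode("utf-8"))


# Made-up sample data; a real replay dictionary would go here.
game = {"width": 160, "height": 160, "players": ["a", "b"]}

# With zstandard this would be: json_to_compressed(game, ZstdCompressor().compressobj())
blob = json_to_compressed(game, zlib.compressobj())

# Round trip: decompressing the full blob gives back the original data.
restored = json.loads(zlib.decompress(blob))
```

One caveat, as far as I know: a zstandard frame produced through compressobj() does not necessarily record the decompressed size, and in that case ZstdDecompressor.decompress() depends on the max_output_size argument. Passing len(data) of the compressed input, as transformCompressedToJson() does, is then likely too small. Using the one-shot ZstdCompressor.compress() for writing, or a decompressobj() for reading, should avoid that.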