I know this question may look a bit unclear, but I have reached a level of frustration that drives me to ask it here.
I'm working with data from a PostgreSQL database, and I get something like this:
2022-06-01 02:21:52.770293 2022-06-01 02:21:52.78704 \\x0a78daa5534d6fe32014fc2fdca90063b0c9a91f52d...
2022-06-01 02:21:55.991809 2022-06-01 02:21:56.04597 \\x0a78dac5534d6be33010fd2fbe2b58b264c9caa9ed4...
I know that the counter column is a compressed string that contains JSON-like data.
I know (because it was already decompressed in the past) that the zlib package can decompress this string, with something like zlib.decompress(mycompressedstring).
But there is a missing step here, because this string (\\x0a78...) cannot be decompressed as it is.
I suspect there is some encoding or decoding work to do before calling zlib, but I am struggling to find out what.
I tried:
test = bytes(sample.iloc[1]['counter'], 'UTF16')
I thought this was a step in the right direction, but zlib cannot decompress the result:
testunc = zlib.decompress(test)
error: Error -3 while decompressing data: incorrect header check
Please, can someone help me here by giving me a track to follow to find what is wrong with this?
The hexadecimal representations starting with 78da... are the starts of valid zlib streams. You need to discard the \\x0a and convert the remainder from hexadecimal to binary. The result of that would be given to zlib.decompress(). Look at a2b_hex in binascii.
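As a minimal sketch of those steps: it assumes the counter value arrives as a Python str that begins with one or more backslashes, then x, then 0a, as in the rows shown in the question. The helper name decompress_counter and the sample payload are made up for illustration; only binascii.a2b_hex, binascii.b2a_hex, zlib.compress and zlib.decompress are real library calls.

```python
import binascii
import zlib


def decompress_counter(value: str) -> bytes:
    """Strip the leading escape and the 0a byte, then hex-decode and inflate."""
    # Assumption: the text starts with backslash(es), then "x", then "0a".
    hex_part = value.lstrip("\\")
    if hex_part.startswith("x"):
        hex_part = hex_part[1:]
    if hex_part.startswith("0a"):
        hex_part = hex_part[2:]
    # Convert the remaining hexadecimal text (78da...) to binary.
    raw = binascii.a2b_hex(hex_part)
    # The binary data is the zlib stream itself.
    return zlib.decompress(raw)


# Hypothetical round trip with an invented payload, since the real rows
# in the question are truncated.
payload = b'{"example": 1}'
stored = "\\x0a" + binascii.b2a_hex(zlib.compress(payload, 9)).decode("ascii")
assert decompress_counter(stored) == payload
```

With the DataFrame from the question this would be called as decompress_counter(sample.iloc[1]['counter']), and if the result is JSON text it could then be passed to json.loads. If the column comes back from the driver as bytes or a memoryview rather than a str, the stripping step would need to be adapted.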