My Python code is trying to decompress a uuencoded file using the zlib library. Here is the code snippet:
self.decompress = zlib.decompressobj(wbits)
.
.
buf = self.fileobj.read(size)
.
.
uncompress = self.decompress.decompress(buf)
My current value for wbits is '-zlib.MAX_WBITS'. This throws an error:
Error -3 while decompressing: invalid literal/lengths set
I realize that the python zlib library supports:
RFC 1950 (zlib compressed format)
RFC 1951 (deflate compressed format)
RFC 1952 (gzip compressed format)
and the choice for wbits is to be:
to (de-)compress deflate format, use wbits = -zlib.MAX_WBITS
to (de-)compress zlib format, use wbits = zlib.MAX_WBITS
to (de-)compress gzip format, use wbits = zlib.MAX_WBITS | 16
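A minimal sketch of those three settings, using the same wbits value on the compressing and decompressing side. The helper name roundtrip and the sample payload are illustrative only and not part of the original code:

```python
import zlib

SAMPLE = b'example payload ' * 20  # illustrative data only


def roundtrip(data, wbits):
    """Compress and decompress data using the same wbits on both sides."""
    comp = zlib.compressobj(9, zlib.DEFLATED, wbits)
    packed = comp.compress(data) + comp.flush()
    decomp = zlib.decompressobj(wbits)
    restored = decomp.decompress(packed) + decomp.flush()
    return packed, restored


for label, wbits in [('deflate', -zlib.MAX_WBITS),
                     ('zlib', zlib.MAX_WBITS),
                     ('gzip', zlib.MAX_WBITS | 16)]:
    packed, restored = roundtrip(SAMPLE, wbits)
    # Print the format name, the compressed size and whether the data survived.
    print(label, len(packed), restored == SAMPLE)
```

Each format round-trips only when the decompressor is given the wbits that matches how the stream was produced.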
So my questions are:
Where does a uuencoded file fall in this list?
Is it supported by zlib?
If yes, what should be the value for wbits?
If no, how do I proceed with this?
Thanks in advance!
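Uuencoding is a binary-to-text encoding, not a compression format, so zlib doesn't handle it directly: you need to uudecode the data first and then decompress the result. Here's a quick demo of how to compress with zlib and encode with uuencode, and then reverse the procedure.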
#!/usr/bin/env python
import zlib
data = '''This is a short piece of test data
intended to test uuencoding and decoding
using the uu module, and compression and
decompression using zlib.
'''
data = data * 5
# encode
enc = zlib.compress(data, 9).encode('uu')
print enc
# decode
dec = zlib.decompress(enc.decode('uu'))
#print `dec`
print dec == data
output
begin 666 <data>
M>-KMCLL-A# ,1.^I8@I 5$,#(?822V C[%RV>CXY; %[19K+/,U(;ZKBN)+A
MU8[ +EP8]D&P!RA'3J+!2DP(Z[0UUF(DNB K@;B7U/Q&4?E:8#-J*P_/HMBV
;'^PNID]/]^6'^N^[RCRFZ?5Y??[P.0$_I03L
end
True
The code above will only work on Python 2. Python 3 makes a clear separation between text and bytes: bytes objects have no .encode method, and .decode only accepts text encodings. So it can't use the simple .encode('uu') / .decode('uu') technique shown above.
Here's a new version that works on both Python 2 and Python 3. Note that the uu module it relies on was deprecated in Python 3.11 and removed in Python 3.13.
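As a side note, the codecs module still exposes the same transform as a bytes-to-bytes codec, so a close equivalent of the short version can be sketched as below. This assumes Python 3.4 or later, where the 'uu' alias is available to codecs.encode and codecs.decode; the sample data is illustrative only.

```python
import codecs
import zlib

databytes = b'This is a short piece of test data\n' * 5

# Compress, then apply the bytes-to-bytes 'uu' transform via codecs.
enc = codecs.encode(zlib.compress(databytes, 9), 'uu')

# Reverse: undo the uuencoding, then decompress.
dec = zlib.decompress(codecs.decode(enc, 'uu'))
print(dec == databytes)
```

Both the input and the output of these calls are bytes, not text.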
from __future__ import print_function
import zlib
import uu
from io import BytesIO
def zlib_uuencode(databytes, name='<data>'):
    ''' Compress databytes with zlib & uuencode the result '''
    inbuff = BytesIO(zlib.compress(databytes, 9))
    outbuff = BytesIO()
    uu.encode(inbuff, outbuff, name=name)
    return outbuff.getvalue()

def zlib_uudecode(databytes):
    ''' uudecode databytes and decompress the result with zlib '''
    inbuff = BytesIO(databytes)
    outbuff = BytesIO()
    uu.decode(inbuff, outbuff)
    return zlib.decompress(outbuff.getvalue())

# Test
# Some plain text data
data = '''This is a short piece of test data
intended to test uuencoding and decoding
using the uu module, and compression and
decompression using zlib.
'''
# Replicate the data so the compressor has something to compress
data = data * 5
#print(data)
print('Original length:', len(data))
# Convert the text to bytes & compress it.
databytes = data.encode()
enc = zlib_uuencode(databytes)
enc_text = enc.decode()
print(enc_text)
print('Encoded length:', len(enc_text))
# Decompress & verify that it's correct
dec = zlib_uudecode(enc)
print(dec == databytes)
output
Original length: 720
begin 666 <data>
M>-KMCLL-A# ,1.^I8@I 5$,#(?822V C[%RV>CXY; %[19K+/,U(;ZKBN)+A
MU8[ +EP8]D&P!RA'3J+!2DP(Z[0UUF(DNB K@;B7U/Q&4?E:8#-J*P_/HMBV
;'^PNID]/]^6'^N^[RCRFZ?5Y??[P.0$_I03L
end
Encoded length: 185
True
Please note that zlib_uuencode and zlib_uudecode work on bytes strings: you must pass them a bytes arg, and they return a bytes result.
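To tie this back to the decompressobj usage in the question, here is a rough sketch that uudecodes line by line with binascii.a2b_uu and feeds each decoded chunk to a zlib decompression object. It does not need the uu module. The function names iter_uudecoded and uudecode_and_inflate are made up for this example, and it assumes the payload is a zlib-format stream (as produced by zlib.compress) and that the uuencoded lines are well formed.

```python
import binascii
import zlib


def iter_uudecoded(lines):
    """Yield the decoded bytes of each data line between 'begin' and 'end'."""
    in_body = False
    for line in lines:
        if not in_body:
            # Skip everything up to and including the 'begin <mode> <name>' header.
            in_body = line.startswith(b'begin ')
            continue
        if line.strip() == b'end':
            break
        if not line.strip():
            continue  # blank or zero-length line carries no data
        yield binascii.a2b_uu(line)


def uudecode_and_inflate(lines, wbits=zlib.MAX_WBITS):
    """uudecode an iterable of bytes lines and decompress the payload."""
    decomp = zlib.decompressobj(wbits)
    out = []
    for chunk in iter_uudecoded(lines):
        out.append(decomp.decompress(chunk))
    out.append(decomp.flush())
    return b''.join(out)
```

A file opened in binary mode can be passed directly as lines, since iterating over it yields bytes lines. If you don't know whether the payload has a zlib or a gzip header, passing wbits=zlib.MAX_WBITS | 32 asks zlib to detect either one automatically.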