I am verifying a hardware design block that performs decompression (inflate). The decompressed output should always be 4 KiB. As test data, I compress 4 KiB chunks of data at a time using zlib's deflate and provide the result as input to my test. I have run multiple regressions and have never observed a case where the code length is 15. Do you have any suggestions on how to get that, or an explanation of why it is not possible?
Here you go:
eF4F4cGBZdmybbnJijFt7eOR931S/14B////3//7f//v//7v//7vf//73//+97///ffff//9
999///3333///v379+/fv3///v379+/fv3///v39/f39/f39/f39/f39/f39/f39/f39/f39
/f39/X6/3+/3+/1+v9/v9/v9fr/f7/f7/X6/3+/3+/1+v9/v9/v9fr/f7/f7vu/7vu/7vu/7
vu/7vu/7vu/7vu/7vu/7vu/7vu/7vu/7vu/7vu/7vu/7vu/7vu/7vu/7vu/7vu9777333nvv
vffee++9995777333nvvvffee++9995777333nvvvffee++9995777333nvvvffee++99957
77333nvvvffee++99957793d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d
3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d
3d3d3d3d3d22bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2
bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2bdu2
bdu2bdu2bdu2bdu2bdu2bdu2bdtWVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVFQAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAA8P8BVoseLg==
That is a Base64 encoding of a zlib stream that decompresses to 4096 bytes and that uses 15-bit codes. It was constructed by generating the Lucas numbers, 2, 1, 3, 4, 7, 11, ..., 521, 843. The initial 2 is decremented to 1, to account for the end-of-block symbol in deflate. Then 15 symbols are emitted with those frequencies. (I chose the lower-case letters `a`..`o`, with `a` appearing 843 times.) That results in a sequence of 2205 bytes, which, with the end-of-block symbol, is the smallest possible input that can result in a 15-bit code. That is less than your 4096, so it is indeed possible to generate the test vector you are looking for.
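The arithmetic above can be checked with a short sketch. It uses a plain textbook Huffman construction, not zlib's own tree builder, and the helper names `lucas` and `max_code_length` are made up for illustration. Since deflate limits codes to 15 bits and the unconstrained tree here is exactly 15 deep, no length limiting should come into play.

```python
import heapq

def lucas(n):
    """Return the first n Lucas numbers: 2, 1, 3, 4, 7, 11, ..."""
    seq = [2, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]

def max_code_length(freqs):
    """Longest code length of an unconstrained Huffman code for freqs."""
    # Each heap entry is (weight, depth of the deepest leaf in that subtree).
    heap = [(f, 0) for f in freqs]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, d1 = heapq.heappop(heap)
        f2, d2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, max(d1, d2) + 1))
    return heap[0][1]

letter_freqs = lucas(15)   # 2, 1, 3, 4, ..., 521, 843
letter_freqs[0] -= 1       # the initial 2 becomes 1 ...
total_bytes = sum(letter_freqs)
# ... because the end-of-block symbol contributes its own count of 1.
longest = max_code_length([1] + letter_freqs)

print(total_bytes, longest)  # prints: 2205 15
```

Each merged node ends up just lighter than the next Lucas number, so every merge extends a single chain, which is what forces the tree to its maximum depth.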
I then appended another 1891 `a`'s, to fill it out to 4096 bytes. That does not change the resulting Huffman code. You then take that sequence and compress it with zlib using the Huffman-only strategy (`Z_HUFFMAN_ONLY` in zlib, or `pigz -zH`), in order to avoid LZ77 compression of the long, repeated strings of symbols.
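As a rough sketch of that recipe in Python, the following builds the 4096-byte input and compresses it with the `zlib` module's `Z_HUFFMAN_ONLY` strategy. The order in which the letters are written out is my assumption (only the counts matter to a Huffman-only encoder), and the helper names are illustrative. The sketch checks the round trip but does not parse the stream to confirm the code lengths, and the output bytes need not match the Base64 above exactly.

```python
import zlib

def lucas(n):
    """Return the first n Lucas numbers: 2, 1, 3, 4, 7, 11, ..."""
    seq = [2, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]

def build_input(size=4096):
    """Letters a..o with Lucas-number counts, padded with 'a' to size bytes."""
    freqs = lucas(15)
    freqs[0] -= 1  # leave one count for deflate's end-of-block symbol
    # Assumed layout: rarest letter first, so 'o' gets 1 and 'a' gets 843.
    letters = reversed("abcdefghijklmno")
    data = b"".join(c.encode() * f for c, f in zip(letters, freqs))
    return data + b"a" * (size - len(data))

def compress_huffman_only(data):
    """Compress to a zlib stream with LZ77 matching disabled."""
    comp = zlib.compressobj(strategy=zlib.Z_HUFFMAN_ONLY)
    return comp.compress(data) + comp.flush()

data = build_input()
stream = compress_huffman_only(data)
print(len(data), zlib.decompress(stream) == data)
```

With the default memory level, zlib's literal buffer holds far more than 4096 symbols, so the whole input should land in a single deflate block with one Huffman code.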
If you just want a raw deflate stream, then remove the first two and last four bytes of the zlib stream.
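A minimal sketch of that stripping step, using an arbitrary stand-in payload rather than the test vector itself: the two leading bytes are the zlib header (assuming no preset dictionary is in use) and the four trailing bytes are the Adler-32 check value. Passing a negative `wbits` to `zlib.decompress` tells it to expect raw deflate.

```python
import zlib

# Stand-in payload; any zlib stream made without a preset dictionary works.
payload = b"example payload " * 16
zstream = zlib.compress(payload)

raw = zstream[2:-4]      # drop the 2-byte header and 4-byte trailer
trailer = zstream[-4:]   # Adler-32 of the uncompressed data, big-endian

restored = zlib.decompress(raw, -15)  # negative wbits: raw deflate, no wrapper
trailer_ok = int.from_bytes(trailer, "big") == zlib.adler32(payload)

print(restored == payload, trailer_ok)  # prints: True True
```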