c++ vector compression zstd

How to correctly compress a vector using the ZSTD simple API?


I'm new to C++ and I want to compress a vector with the ZSTD compression library. I used the ZSTD simple API functions ZSTD_compress and ZSTD_decompress in the same way as the official example. However, I ran into a weird issue: after compressing and then decompressing a vector, the decompressed vector is not the same as the original. I'm not sure which part of my code is wrong. I looked through ZSTD's GitHub page and didn't find an answer. Any help or ideas on how to solve this would be appreciated.


Example C code: https://github.com/facebook/zstd/blob/dev/examples/simple_compression.c

//Initialize a vector
    vector<int> NumToCompress ;
    NumToCompress.resize(10000);
    for(int i = 0; i < 10000; i++)
    {
        NumToCompress[i] = rand()% 255;
    }
    //compress
    int* com_ptr = NULL;
    size_t NumSize = NumToCompress.size();
    size_t Boundsize = ZSTD_compressBound(NumSize);
    com_ptr =(int*) malloc(Boundsize);
    size_t ComSize;
    ComSize = ZSTD_compress(com_ptr,Boundsize,NumToCompress.data(),NumToCompress.size(),ZSTD_fast);
    //decompress
    int* decom_ptr = NULL;
    unsigned long long decom_Boundsize;
    decom_Boundsize = ZSTD_getFrameContentSize(com_ptr, ComSize);
    decom_ptr = (int*)malloc(decom_Boundsize);
    size_t  DecomSize;
    DecomSize = ZSTD_decompress(decom_ptr, decom_Boundsize, com_ptr, ComSize);
    vector<int> NumAfterDecompress(decom_ptr,decom_ptr+DecomSize);
    //check if two vectors are same
    if(NumToCompress == NumAfterDecompress)
    {
        cout << "Two vectors are same" << endl;
    }else
    {cout << "Two vectors are insame" << endl;}
    free(com_ptr);
    free(decom_ptr);

Question 1: Can zstd compress a std::vector directly? Question 2: If it can, how do I properly compress a vector with zstd?

Two vectors are insame
Original vector:
163     151     162     85      83      190     241     252     249     121     107     82      20      19      233     226     45      81      142     31      86      8       87      39      167     5       212   208      82      130     119     117     27      153     74      237     88      61      106     82      54      213     36      74      104     142     173     149     95      60      53      181     196     140   221      108     17      50      61      226     180     180     89      207     206     35      61      39      223     167     249     150     252     30      224     102     44      14      123     140     202   48       66      143     188     159     123     206     209     184     177     135     236     138     214     187     46      21      99      14
Decompressed vector:
163     151     162     85      83      190     241     252     249     121     107     82      20      19      233     226     45      81      142     31      86      8       87      39      167     0       417   0551929248       21916   551935408       21916   551933352       21916   551939512       21916   0       0       0       0       0       0       0       0       0       0       0       0       0       0       0     00       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0     00       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0

Solution

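  • The zstd API expects sizes in bytes, but you are passing the number of elements. ZSTD_compress therefore compresses only the first NumToCompress.size() bytes, which is a quarter of the data when sizeof(int) is 4. That matches your output, where only the first 25 of the 100 printed values come back correctly. You need to multiply the number of elements by the size of each element, like this: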

    ComSize = ZSTD_compress(com_ptr, Boundsize, NumToCompress.data(), 
        NumToCompress.size()*sizeof(int), ZSTD_fast);
    

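    The same applies to ZSTD_compressBound, which should be given NumSize*sizeof(int) so that the destination buffer is large enough. Similarly, ZSTD_decompress returns the decompressed size in bytes, so when you decompress you need to divide by the element size to get the number of elements: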

    DecomSize = ZSTD_decompress(decom_ptr, decom_Boundsize, com_ptr, ComSize);
    vector<int> NumAfterDecompress(decom_ptr, decom_ptr+DecomSize/sizeof(int));