c++ boost compression gzip ifstream

How to re-initiate a read after reaching EOF during stream decompression with Boost.Iostreams?


I am trying to implement a streaming decompressor with Boost.Iostreams that can work with incomplete compressed files (the size of the uncompressed file is known before decompression starts). The compressor and decompressor run simultaneously, and because the compressor is slower, the decompressor reaches the end of the file before compression has finished. I am trying to reset the stream so that reading can resume, but I cannot get it to work: gcount() still returns 0 after clear() and seekg(0). My ultimate goal is to continue from the point where the end of file was reached instead of returning to the beginning, but I cannot even return to the beginning of the file.

I would appreciate any kind of support. Thank you in advance.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>

#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/iostreams/filtering_stream.hpp>

const std::size_t bufferSize = 1024;
const std::size_t testDataSize = 13019119616; 

int main() {

    // Decompress
    std::ofstream outStream("image_boost_decompressed.img", std::ios_base::out | std::ios_base::binary);
    std::ifstream inStream("image_boost_compressed.img.gz", std::ios_base::in | std::ios_base::binary);
    
    boost::iostreams::filtering_istream out;
    out.push(boost::iostreams::gzip_decompressor());
    out.push(inStream);

    char buf[bufferSize] = {};

    std::cout << "Decompression started!" << std::endl;

    std::size_t loopCount = 0;
    std::size_t decompressedDataSize = 0;

    while(decompressedDataSize < testDataSize) {
        std::cout << "cursor bef: " << inStream.tellg() << std::endl; 

        out.read(buf, bufferSize);

        std::cout << "read size: " << out.gcount() << std::endl;
        std::cout << "cursor after: " << inStream.tellg() << std::endl; 

        if (out.gcount() > 0) {
            outStream.write(buf, out.gcount());
            decompressedDataSize = decompressedDataSize + out.gcount();
        } else if (out.gcount() == 0) {
            std::cout << "clear initiated!" << std::endl;
            inStream.clear();
            inStream.seekg(0);
        }
        std::cout << "----------------" << std::endl;
    }

    std::cout << "Decompression ended!" << std::endl;
    std::cout << "decompressed data size: " << decompressedDataSize << std::endl;
    outStream.close();

    return 0;
}



Solution

  • If you want to pick up where you left off, call clear() to reset the EOF state and then seekg(0, std::ios_base::cur). It works:

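    #include <iostream>
    #include <fstream>
    
    int main() {
        // Writer: put the first line on disk.
        std::ofstream out("test.out");
        out << "line 1\n";
        out.flush();
        // Reader: consume everything currently in the file.
        std::ifstream in("test.out");
        char line[256];
        in.read(line, sizeof(line) - 1);   // leave room for the terminator
        line[in.gcount()] = 0;
        std::cout << line;
        if (in.eof())
            std::cout << "-- at eof\n";
        // Writer appends more data while the reader is sitting at EOF.
        out << "line 2\n";
        out.flush();
        // Reset eofbit/failbit so the stream is usable again.
        in.clear();
        if (in.good())
            std::cout << "-- now good!\n";
        // Seek to the current position to resume reading from there.
        in.seekg(0, std::ios_base::cur);
        in.read(line, sizeof(line) - 1);
        line[in.gcount()] = 0;
        std::cout << line;
        in.close();
        out.close();
    }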
    

    As for the decompressor, you don't want to let it see an end-of-input indicator before the compressed data is actually complete. Run the decompressor separately from the file reader, and feed it only the bytes you have read so far.
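
The last point can be sketched with the standard library alone. The snippet below is a minimal illustration, not a complete solution: `TailReader` and `pump` are hypothetical names introduced here, and the `consume` callback stands in for whatever incremental decompressor you use (an assumption; the decompression step itself is not shown). Each call to `pump` clears the stream state, seeks back to the remembered offset, and hands over only the bytes that have been appended since the previous call, so the consumer never sees an end-of-input signal.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: delivers the bytes appended to a growing file
// since the previous call, without ever reporting EOF to the consumer.
class TailReader {
public:
    explicit TailReader(const std::string& path)
        : in_(path, std::ios_base::in | std::ios_base::binary) {}

    bool isOpen() const { return in_.is_open(); }

    // Reads everything currently available past the remembered offset and
    // passes it to `consume` in chunks. Returns the number of bytes delivered;
    // 0 means "nothing new yet", not "end of stream".
    std::size_t pump(const std::function<void(const char*, std::size_t)>& consume) {
        std::size_t total = 0;
        in_.clear();                                // drop eofbit/failbit from the last pass
        in_.seekg(offset_, std::ios_base::beg);     // resume where the last pass stopped
        while (true) {
            in_.read(buf_.data(), static_cast<std::streamsize>(buf_.size()));
            const std::streamsize n = in_.gcount();
            if (n <= 0) break;                      // no more data for now
            consume(buf_.data(), static_cast<std::size_t>(n));
            offset_ += n;
            total += static_cast<std::size_t>(n);
        }
        return total;
    }

private:
    std::ifstream in_;
    std::streamoff offset_ = 0;                     // bytes already handed to the consumer
    std::vector<char> buf_ = std::vector<char>(4096);
};
```

In the scenario from the question, you would call `pump` in a loop (sleeping briefly whenever it returns 0) and stop once the decompressed byte count reaches the known uncompressed size, rather than relying on EOF of the compressed file.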