istream_iterator behavior misunderstanding


The goal is to read 16-bit signed integers from a binary file. First, I open the file as an ifstream, then I would like to copy each number into a vector using istream_iterator and the copy algorithm. I don't understand what's wrong with this snippet:

#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

int main(int argc, char *argv[]) {
    std::string filename("test.bin");

    std::ifstream is(filename);
    if (!is) {
        std::cerr << "Error while opening input file\n";
        return EXIT_FAILURE;
    }
    
    std::noskipws(is);
    std::vector<int16_t> v;
    std::copy(
        std::istream_iterator<int16_t>(is),
        std::istream_iterator<int16_t>(),
        std::back_inserter(v)
    );

    // v is still empty
}

This code produces no error, but the vector remains empty after the call to std::copy. Since I'm opening the file in the default ("text") mode, I was expecting istream_iterator to work even if the file is binary. Clearly there's something I'm missing about the behavior of this class.


Solution

  • First, to read a binary file with ifstream, you need to open the file in binary mode, not text mode (the default). In text mode, read operations may misinterpret line-break bytes and translate them between platform conventions (e.g. CRLF to LF, or vice versa), thus corrupting your binary data.

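    Second, istream_iterator uses operator>>, which performs formatted input: for int16_t it parses a textual decimal number instead of copying raw bytes. If the first bytes of the file do not look like a number, the very first extraction fails, the stream enters a failed state, and the iterator compares equal to the end-of-stream iterator, so std::copy inserts nothing. That is most likely why your vector stays empty. std::noskipws does not change this; it only stops operator>> from skipping leading whitespace. To read raw bytes you need to use istream::read() instead. There is no standard iterator wrapper for read(), but you can write your own if needed.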
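To see the formatted-input behavior in isolation, here is a minimal sketch that runs istream_iterator<int16_t> over an in-memory stream instead of a file. The helper name parse_formatted and its skip_ws parameter are made up for this illustration.

```cpp
#include <cstdint>
#include <ios>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of any library): collects every value that
// istream_iterator<int16_t> manages to extract from `bytes` via operator>>.
std::vector<std::int16_t> parse_formatted(const std::string& bytes,
                                          bool skip_ws = true) {
    std::istringstream in(bytes);
    if (!skip_ws) {
        in >> std::noskipws;  // same effect as std::noskipws(is) in the question
    }
    // The first extraction happens as soon as `first` is constructed. If it
    // fails, `first` already equals the end-of-stream iterator `last`.
    std::istream_iterator<std::int16_t> first(in), last;
    return std::vector<std::int16_t>(first, last);
}
```

With the text "12 -7 300" this returns the three numbers 12, -7 and 300. With the four raw bytes 01 00 02 00 (what a binary file holding 1 and 2 might contain on a little-endian machine) it returns an empty vector, because the byte 0x01 is not a digit. With skip_ws set to false, the text input yields only 12, since the space before -7 is no longer skipped.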

    Try this instead:

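    #include <cstdint>
    #include <cstdlib>
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>
    
    int main(int argc, char *argv[]) {
        std::string filename = "test.bin";
    
        // open in binary mode so no line-break translation takes place
        std::ifstream is(filename, std::ifstream::binary);
        if (!is) {
            std::cerr << "Error while opening input file\n";
            return EXIT_FAILURE;
        }
    
        std::vector<int16_t> vec;
        int16_t value;
    
        // read() copies raw bytes; the loop ends at EOF or on a partial value
        while (is.read(reinterpret_cast<char*>(&value), sizeof(value))) {
            // swap value's endian, if needed...
            vec.push_back(value);
        }
    
        // use vec as needed...
    
        return 0;
    }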
    

    That being said, if you really want to use istream_iterator for a binary file, you would have to write a custom class/struct that wraps int16_t, and then define an operator>> for that type which calls read(), e.g.:

    #include <algorithm>
    #include <cstdint>
    #include <cstdlib>
    #include <fstream>
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <vector>
    
    // thin wrapper so that operator>> can be overloaded for raw 16-bit reads
    struct myInt16_t {
        int16_t value; 
        operator int16_t() const { return value; }
    };
    
    std::istream& operator>>(std::istream &is, myInt16_t &v) {
        if (is.read(reinterpret_cast<char*>(&v.value), sizeof(v.value))) {
            // swap v.value's endian, if needed...
        }
        return is;
    }
    
    int main(int argc, char *argv[]) {
        std::string filename = "test.bin";
    
        std::ifstream is(filename, std::ifstream::binary);
        if (!is) {
            std::cerr << "Error while opening input file\n";
            return EXIT_FAILURE;
        }
    
        // no std::noskipws needed: read() is unformatted and never skips whitespace
        std::vector<int16_t> vec;
        std::copy(
            std::istream_iterator<myInt16_t>(is),
            std::istream_iterator<myInt16_t>(),
            std::back_inserter(vec)
        );
    
        // use vec as needed...
    
        return 0;
    }