c++ stdvector iostream ifstream

How to read data from a binary file into a vector in C++?


I want to read a binary file into a vector. Everything seems to work fine: the file size and the element counts computed in the function all have the expected values.

However, the binary data never ends up in data; the vector isn't getting populated.

void loadDataFromFile(const std::string& filePath, std::vector<DataArray>& data) 
{
    std::ifstream inFile(filePath, std::ios::binary);
    if (!inFile) {
        std::cerr << "Error opening file!" << std::endl;
        return; // nothing to read if the file could not be opened
    }
    
    // Get the size of the file
    inFile.seekg(0, std::ios::end);
    std::streamsize file_size = inFile.tellg();
    inFile.seekg(0, std::ios::beg);

    constexpr std::size_t bufferSize = 700 * 1024 * 1024; // 700 MB buffer size (adjust as needed)

    std::size_t elementSize = sizeof(DataArray);
    std::size_t elementsPerBuffer = bufferSize / elementSize;
    std::size_t totalElements = file_size / elementSize;

    data.reserve(totalElements);
    
    std::size_t offset = 0;
    while (offset < totalElements) {
        std::size_t count = std::min(elementsPerBuffer, totalElements - offset);

        inFile.read(reinterpret_cast<char*>(&data[offset]), count * elementSize);

        offset += count;
    }

    inFile.close();
}

Solution

  • The issue does not actually relate to reading the data from the file, but rather to the way you attempted to set the size of the vector before populating it.

    In this line:

    data.reserve(totalElements);
    

    You call std::vector::reserve which will:

    Increase the capacity of the vector (the total number of elements that the vector can hold without requiring reallocation) to a value that's greater or equal to new_cap.

    The actual size() of the vector does not change, and accessing elements at an index >= the current size (as &data[offset] does in your read loop) invokes undefined behavior.
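
    To make the effect of reserve concrete, here is a minimal sketch. The helper name sizeAndCapacityAfterReserve and the SizeAndCapacity struct are made up for this illustration and are not part of your code:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type, used only to report both numbers at once.
struct SizeAndCapacity {
    std::size_t size;
    std::size_t capacity;
};

// Calls reserve(n) on an empty vector and reports what changed.
SizeAndCapacity sizeAndCapacityAfterReserve(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n); // allocates storage for at least n elements, creates none
    return {v.size(), v.capacity()};
}
```

    Calling sizeAndCapacityAfterReserve(100) yields a size of 0 and a capacity of at least 100: the storage exists, but the vector still contains no elements.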

    Instead, since you want to populate the vector with totalElements elements, you should call std::vector::resize:

    data.resize(totalElements);
    

    This changes the size() of the vector to totalElements, so all of those elements exist and the read loop can legally write into them.
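
    Putting it together, a corrected version of your function could look roughly like the sketch below. Several things in it are assumptions: the definition of DataArray is not shown in the question, so a simple stand-in struct is used; the bool return value and the 64 MB chunk size are choices made for this example only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in: the real DataArray is not shown in the question.
struct DataArray {
    std::int32_t id;
    float values[3];
};

// Reading raw bytes into objects is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<DataArray>::value,
              "DataArray must be trivially copyable to be read as raw bytes");

// Returns true on success (the original returned void; bool is an assumption).
bool loadDataFromFile(const std::string& filePath, std::vector<DataArray>& data)
{
    std::ifstream inFile(filePath, std::ios::binary);
    if (!inFile) {
        return false;
    }

    // Get the size of the file
    inFile.seekg(0, std::ios::end);
    const std::streamoff fileSize = inFile.tellg();
    if (fileSize < 0) {
        return false;
    }
    inFile.seekg(0, std::ios::beg);

    const std::size_t elementSize = sizeof(DataArray);
    const std::size_t totalElements =
        static_cast<std::size_t>(fileSize) / elementSize;

    // resize (not reserve): the elements must exist before they are written to
    data.resize(totalElements);

    // Arbitrary chunk size for this sketch; at least one element per chunk
    constexpr std::size_t chunkBytes = 64 * 1024 * 1024;
    const std::size_t elementsPerChunk =
        std::max<std::size_t>(1, chunkBytes / elementSize);

    std::size_t offset = 0;
    while (offset < totalElements) {
        const std::size_t count =
            std::min(elementsPerChunk, totalElements - offset);
        inFile.read(reinterpret_cast<char*>(data.data() + offset),
                    static_cast<std::streamsize>(count * elementSize));
        if (!inFile) {
            return false; // short read or I/O error
        }
        offset += count;
    }
    return true;
}
```

    The chunked loop is kept only to mirror your original code; once the vector has been resized, a single read of totalElements * elementSize bytes would work as well.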

    Note:
    Creating objects (the elements of your vector, of type DataArray) by copying raw bytes into them, as you do here, is only valid for trivially copyable types. If your DataArray isn't such a type (for example, if it contains a std::string or a std::vector), you will need proper serialization and deserialization, preferably using an existing library.
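
    One way to have the compiler check this requirement for you is std::is_trivially_copyable from <type_traits>. The sketch below uses two made-up types, PlainRecord and RecordWithString, and a hypothetical helper canLoadFromRawBytes; none of these names come from your code:

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical type that is safe to fill from raw bytes.
struct PlainRecord {
    int id;
    double values[4];
};

// Hypothetical type that is NOT safe: std::string owns heap memory.
struct RecordWithString {
    int id;
    std::string name;
};

// True if T may be populated by copying raw bytes into it.
template <typename T>
constexpr bool canLoadFromRawBytes()
{
    return std::is_trivially_copyable<T>::value;
}

static_assert(canLoadFromRawBytes<PlainRecord>(),
              "PlainRecord can be read directly from a binary file");
static_assert(!canLoadFromRawBytes<RecordWithString>(),
              "RecordWithString needs real serialization");
```

    Placing a similar static_assert on DataArray next to the loading function turns an accidental change to the type (such as adding a std::string member) into a compile error instead of silent undefined behavior.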