Tags: c++, qfile, qvector, qhash, qdatastream

Data read from file takes way more memory than file size


I've written some data into a file the following way:

QHash<QPair<int, int>, QVector<double> > *result =
    new QHash<QPair<int, int>, QVector<double> >;
QFile resfile("result.txt");
resfile.open(QIODevice::WriteOnly | QIODevice::Append);
QDataStream out(&resfile);
while (condition)
{
    QString s = " something";
    out << s;
    result->insert(QPair<int, int>(arange, trange), coeffs);
    out << *result;
}

The file ended up at 484 MB. After that I read it back in a loop:

QString s;
QVector<QHash<QPair<int, int>, QVector<double> > > thickness_result;
QFile resfile("result.txt");
resfile.open(QIODevice::ReadOnly);
QDataStream in(&resfile);
while (!in.atEnd())
{
    thickness_result.resize(thickness_result.size() + 1);
    in >> s >> thickness_result.last();
}

While this read loop is running, I see in the task manager that my program grows to about 1300 MB of memory, and then I get an "In file text\qharfbuzzng.cpp, line 626: Out of memory" error. My question is: is it normal that the program takes more than 2x the file size in memory, and should I be reading the file in chunks, or am I doing something wrong?


Solution

  • WARNING All the following assumes that QVector behaves like std::vector

    Yes, it's normal. What's happening is that when you have 1024 elements, and want to read another one, the call to resize is allocating capacity for 2048 elements, moving the first 1024 elements in, and then constructing the 1025th element. It destroys the old array, and returns the memory to the heap (but not to the operating system). Then when you come to read the 2049th element, it does it all again, only this time allocating 4096 elements. The heap has a chunk of 1024 elements space, but that's no use when you want 4096. Now you have chunks of 1024, 2048, and 4096 elements in the heap (two of which are free and available for reuse).

    Repeat until you have read the file. You will see that you end up with (about) twice the file size.
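    The doubling behaviour is easy to observe directly. Here is a minimal sketch using std::vector (the same assumption the warning above makes about QVector): it appends elements one at a time and counts how often the vector has to move to a new, larger heap block. The exact growth factor is implementation-defined (libstdc++ doubles, MSVC grows by 1.5x), so the sketch only asserts that reallocations are rare, not their exact count.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <iostream>
    #include <vector>

    int main()
    {
        std::vector<double> v;
        std::size_t reallocations = 0;
        std::size_t last_capacity = v.capacity();

        // Append one element at a time and count how often the vector
        // grabs a new, larger block from the heap.
        for (int i = 0; i < 100000; ++i) {
            v.push_back(i);
            if (v.capacity() != last_capacity) {
                ++reallocations;
                last_capacity = v.capacity();
            }
        }

        // Growth is geometric, so the number of reallocations is
        // logarithmic in the final size: far fewer than 100000 moves,
        // but each one briefly keeps both the old and new buffer alive.
        std::cout << "size: " << v.size()
                  << " capacity: " << v.capacity()
                  << " reallocations: " << reallocations << '\n';
        assert(v.size() == 100000);
        assert(v.capacity() >= v.size());
        assert(reallocations < 64);
        return 0;
    }
    ```

    Note that it is during a reallocation, while both buffers exist and the old freed chunks sit unreusable in the heap, that the peak memory use well above the payload size occurs.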

    The first rule is "don't worry about it" - it usually isn't a problem. However, for you it clearly is.

    Can you switch to a 64-bit program? That should make the problem go away.

    The other option is to guess how many elements you have (from the file size) and call .reserve on the vector at the start.
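    A minimal sketch of that option, again using std::vector under the same QVector-behaves-like-std::vector assumption. The 484 MB figure comes from the question; the bytes-per-record estimate is a hypothetical placeholder you would tune to your own data. With reserve called up front, the standard guarantees no reallocation happens until size() exceeds the reserved capacity, so memory use stays flat at one buffer:

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <vector>

    int main()
    {
        const std::size_t file_size = 484u * 1024 * 1024;  // 484 MB, from the question
        const std::size_t approx_bytes_per_record = 4096;  // hypothetical per-record estimate

        std::vector<int> records;                          // stand-in for the real element type
        records.reserve(file_size / approx_bytes_per_record);  // one allocation, up front

        const std::size_t reserved = records.capacity();
        for (int i = 0; i < 1000; ++i)
            records.push_back(i);

        // While size() stays at or below the reserved capacity,
        // push_back never reallocates, so no doubling and no
        // transient old-plus-new buffer pair.
        assert(records.capacity() == reserved);
        return 0;
    }
    ```

    Overestimating the count only wastes the difference once; underestimating just means the geometric growth resumes from the reserved capacity, so a rough guess from the file size is usually good enough.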