My company's products run on a number of qualified Linux hardware/software configurations. Historically, the compiler used has been GNU C++. For purposes of this post, let's consider version 3.2.3 the baseline, as our software 'worked as expected' through that version.
With the introduction of a newer qualified platform, using GNU C++ version 3.4.4, we began to observe some performance problems which we had not seen before. After some digging, one of our engineers came up with this test program:
#include <cstdio>    // EOF
#include <fstream>
#include <iostream>
using namespace std;

// filebuf that counts how often underflow() is called
class my_filebuf : public filebuf
{
public:
    my_filebuf() : filebuf(), d_underflows(0) {}
    virtual ~my_filebuf() {}
    virtual pos_type seekoff(off_type, ios_base::seekdir,
                             ios_base::openmode mode = ios_base::in | ios_base::out);
    virtual int_type underflow();
public:
    unsigned int d_underflows;
};

// pass-through: forwards to the base class unchanged
filebuf::pos_type my_filebuf::seekoff(
    off_type off,
    ios_base::seekdir way,
    ios_base::openmode mode
)
{
    return filebuf::seekoff(off, way, mode);
}

filebuf::int_type my_filebuf::underflow()
{
    d_underflows++;
    return filebuf::underflow();
}

int main()
{
    my_filebuf fb;
    fb.open("log", ios_base::in);
    if (!fb.is_open())
    {
        cerr << "need log file" << endl;
        return 1;
    }
    int count = 0;
    streampos pos = EOF;
    while (fb.sbumpc() != EOF)
    {
        count++;
        // calling pubseekoff(0, ios::cur) *forces* underflow
        pos = fb.pubseekoff(0, ios::cur);
    }
    cerr << "pos=" << pos << endl;
    cerr << "read chars=" << count << endl;
    cerr << "underflows=" << fb.d_underflows << endl;
    return 0;
}
We ran it against a log file of approximately 750 KB (768058 characters). With the previous configurations, we got this result:
$ buftest
pos=768058
read chars=768058
underflows=0
In the newer version, the result is:
$ buftest
pos=768058
read chars=768058
underflows=768059
Comment out the pubseekoff(0, ios::cur) call and the excessive underflow() calls go away. So clearly, in newer versions of g++, calling pubseekoff() 'invalidates' the buffer, forcing a call to underflow().
I've read the standards document, and the verbiage on pubseekoff() is certainly ambiguous. What is the relationship of the underlying file pointer position to that of gptr(), for instance? Before or after a call to underflow()? Regardless of this, I find it irritating that g++ 'changed horses in midstream', so to speak. Moreover, even if a general seekoff() required invalidating the buffer pointers, why should the equivalent of ftell()?
Can anyone point me to a discussion thread amongst the implementors which led up to this change in behavior? Do you have a succinct description of the choices and tradeoffs involved?
Clearly I really don't know what I'm doing. I was experimenting to determine whether there was a way, however non-portable, to bypass the invalidation in the case where the offset is 0 and seekdir is ios::cur. I came up with the following hack, which directly accesses the filebuf data member _M_file (this would only compile with the 3.4.4 version on my machine):
int sc(0);
filebuf::pos_type my_filebuf::seekoff(
    off_type off,
    ios_base::seekdir way,
    ios_base::openmode mode
)
{
    if ((off == 0) && (way == ios::cur))
    {
        FILE *file = _M_file.file();
        pos_type pos = pos_type(ftell(file));
        sc++;
        if ((sc % 100) == 0) {
            cerr << "POS IS " << pos << endl;
        }
        return pos;
    }
    return filebuf::seekoff(off, way, mode);
}
However, the diagnostic that prints the position every hundred seekoff calls yields 8192 every time. Huh? Since this is the FILE * member of the filebuf itself, I'd expect its file position pointer to be in sync with any underflow() calls made by the filebuf. Why am I wrong?
First, let me emphasize that I understand this later part of my post is all about non-portable hacks. Still, I don't understand the nitty-gritty here. I tried calling
    pos_type pos = _M_file.seekoff(0, ios::cur);
instead, and this happily progresses through the sample file, rather than getting stuck at 8192.
Internally at my company, we've made some workarounds that reduce the performance hit enough that we can live with it.
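The workarounds themselves are not described in this post. Purely as an illustration of one way to avoid calling pubseekoff(0, ios::cur) inside a read loop, the sketch below keeps its own running offset while reading. It assumes strictly sequential reads starting at offset 0 and a stream in which one character read corresponds to one byte of the file. The names counting_reader and consume_all are hypothetical, and a stringbuf stands in for the filebuf so the example is self-contained.

```cpp
#include <cassert>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper (not from the original post): reads characters from a
// streambuf and tracks the logical offset itself, so the caller never needs
// to ask the buffer for its position.
class counting_reader
{
public:
    explicit counting_reader(std::streambuf* sb) : sb_(sb), offset_(0)
    {
        assert(sb != 0);
    }

    // Returns the next character, or char_traits<char>::eof() at end of input.
    int next()
    {
        const int c = sb_->sbumpc();
        if (c != std::char_traits<char>::eof())
            ++offset_;   // one more character consumed
        return c;
    }

    // Number of characters consumed so far (assumed equal to the file offset).
    std::streamoff offset() const { return offset_; }

private:
    std::streambuf* sb_;
    std::streamoff offset_;
};

// Reads the whole text through a counting_reader and returns the final offset.
std::streamoff consume_all(const std::string& text)
{
    std::stringbuf sb(text, std::ios_base::in);
    counting_reader reader(&sb);
    while (reader.next() != std::char_traits<char>::eof())
    {
    }
    return reader.offset();
}
```

With this approach the position is available after every character without touching seekoff(), at the cost of breaking as soon as anything else seeks or reads on the same buffer.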
Externally, David Krauss filed a bug against GNU's libstdc++ streams, and recently, Paolo Carlini checked in a fix. The consensus was that the undesired behavior was within the scope of the Standard, but that there was a reasonable fix for the edge case I described.
So thanks, StackOverflow, David Krauss, Paolo Carlini, and all the GNU developers!
The requirements of seekoff certainly are confusing, but seekoff(0, ios::cur) is supposed to be a special case that doesn't synchronize anything. So this could probably be considered a bug.
And it still happens in GCC 4.2.1 and 4.5…
The problem is that (0, ios::cur) is not special-cased in _M_seek, which seekoff uses to call fseek to obtain its return value. So long as that succeeds, _M_seek unconditionally calls _M_set_buffer(-1), which predictably invalidates the internal buffer. The next read operation causes underflow.
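The effect can be reproduced with a toy model. The sketch below is not libstdc++ code: ToyBuf and count_underflows are hypothetical names, and the "file" is just a string. It shows a tell-style query that either discards the buffer every time (the behavior described above) or computes the position from the underlying offset minus the unread part of the buffer and leaves the buffer alone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Toy buffered reader over an in-memory "file". Illustration only.
class ToyBuf
{
public:
    ToyBuf(const std::string& data, std::size_t buf_size, bool special_case_tell)
        : data_(data), buf_size_(buf_size), special_case_tell_(special_case_tell),
          file_pos_(0), buf_next_(0), buf_end_(0), underflows_(0)
    {
        assert(buf_size > 0);
    }

    // Returns the next character, or -1 at end of data (like sbumpc()).
    int bump()
    {
        if (buf_next_ == buf_end_ && !refill())
            return -1;
        return static_cast<unsigned char>(data_[buf_next_++]);
    }

    // Reports the logical read position, like pubseekoff(0, ios::cur).
    std::size_t tell()
    {
        // underlying offset minus the characters buffered but not yet consumed
        const std::size_t logical = file_pos_ - (buf_end_ - buf_next_);
        if (!special_case_tell_)
        {
            // Mimic the invalidating behavior: reposition the underlying
            // "file" and throw the buffered data away.
            file_pos_ = logical;
            buf_next_ = buf_end_ = logical;
        }
        return logical;
    }

    std::size_t underflows() const { return underflows_; }

private:
    // Refills the buffer from the underlying data; counts as one underflow.
    bool refill()
    {
        ++underflows_;
        const std::size_t n = std::min(buf_size_, data_.size() - file_pos_);
        buf_next_ = file_pos_;
        buf_end_ = file_pos_ + n;
        file_pos_ += n;
        return n > 0;
    }

    std::string data_;
    std::size_t buf_size_;
    bool special_case_tell_;
    std::size_t file_pos_;   // offset of the underlying "file"
    std::size_t buf_next_;   // absolute offset of the next buffered character
    std::size_t buf_end_;    // absolute offset one past the buffered data
    std::size_t underflows_;
};

// Reads n_chars characters, calling tell() after each one, and returns the
// number of refills that were needed.
std::size_t count_underflows(std::size_t n_chars, std::size_t buf_size,
                             bool special_case_tell)
{
    ToyBuf buf(std::string(n_chars, 'x'), buf_size, special_case_tell);
    while (buf.bump() != -1)
        buf.tell();
    return buf.underflows();
}
```

In this model, count_underflows(100, 16, false) returns 101, one refill per character plus the final one that hits end of data, which mirrors the underflows=768059 for 768058 characters seen in the question. With the special case enabled, count_underflows(100, 16, true) returns 8.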
Found the diff! See change -473,41 +486,26. The comment was:
(seekoff): Simplify, set _M_reading, _M_writing to false, call
_M_set_buffer(-1) ('uncommitted').
So this wasn't done to fix a bug.
Filed bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45628