Given the following snippet:
#include <iostream>
#include <sstream>
int main()
{
    std::stringstream str;
    str.put('a');
    str.put('\x80');
    str.put('a');
    str.ignore(32, '\x80'); // hangs
    std::cout << str.tellg() << "\n";
}
If compiled with GCC, the marked line hangs; stepping through the assembly indicates an infinite loop. I tried GCC 5.4, 6.3, 8.2, and 9.2 on different OSes, and the result is the same. On Wandbox, I also tried clang (which probably uses libc++ instead of libstdc++), and it terminates fine.
It only happens if the second argument of ignore is a character with the MSB set, and if there's at least one character before and after it in the stream. Is this a bug in libstdc++, or does the standard prohibit non-ASCII delimiters?
There's no signed overflow, but the problem is related to signed vs unsigned.
The internals of istream::ignore(n, delim) use streambuf::sgetc() to check the next character and compare it to the delimiter delim. sgetc() returns -1 for EOF or a non-negative value otherwise. That means when it reaches the '\x80' character, sgetc() returns (int)(unsigned char)'\x80', which is 128 and never compares equal to delim, which is -128.
Looping forever is a bug in GCC's libstdc++, but the code in the question is not expected to work. For a char with a negative value, you should use std::char_traits<char>::to_int_type('\x80') or just (unsigned char)'\x80' to convert it to a non-negative value of the stream's int_type.