ungetc() seems to fail on some characters. Here is a simple test program:
#include <stdio.h>

/* Print an expression's text followed by its int value. */
#define TRACE(x) printf("%s -> %d\n", #x, x)

int main(void) {
    int c;

    printf("Type a letter and the enter key: ");
    TRACE(c = getc(stdin));
    TRACE(ungetc(c, stdin));
    TRACE(getc(stdin));
    TRACE(ungetc('\xFE', stdin));
    TRACE(getc(stdin));
    TRACE(ungetc('\xFF', stdin));
    TRACE(getc(stdin));
    return 0;
}
I run it on a Unix system and type a followed by Enter at the prompt. The output is:
Type a letter and the enter key: a
c = getc(stdin) -> 97
ungetc(c, stdin) -> 97
getc(stdin) -> 97
ungetc('\xFE', stdin) -> 254
getc(stdin) -> 254
ungetc('\xFF', stdin) -> -1
getc(stdin) -> 10
I expected this:
Type a letter and the enter key: a
c = getc(stdin) -> 97
ungetc(c, stdin) -> 97
getc(stdin) -> 97
ungetc('\xFE', stdin) -> 254
getc(stdin) -> 254
ungetc('\xFF', stdin) -> 255
getc(stdin) -> 255
What is causing ungetc() to fail?
EDIT: to make things worse, I tested the same code on a different unix system, and it behaves as expected there. Is there some kind of undefined behavior?
Working on the following assumptions:
- '\xFF' is -1 on your system (the value of out-of-range character constants is implementation-defined, see below).
- EOF is -1 on your system.

The call ungetc('\xFF', stdin); is the same as ungetc(EOF, stdin); whose behaviour is covered by C11 7.21.7.10/4:
If the value of c equals that of the macro EOF, the operation fails and the input stream is unchanged.
The input range for ungetc is the same as the output range of getchar, i.e. EOF, which is negative, or a non-negative value representing a character (with negative characters being represented by their conversion to unsigned char). I presume you were going for ungetc(255, stdin);.
Regarding the value of '\xFF', see C11 6.4.4.4/10:
The value of an integer character constant [...] containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.
Also, the values of the execution character set are implementation-defined (C11 5.2.1/1). You could check the compiler documentation to be sure, but the compiler behaviour suggests that 255 is not in the execution character set; and in fact the behaviour of a gcc version I tested suggests that it takes the range of char as the execution character set (not the range of unsigned char).