I'm confused about the behaviour of unrecognised preprocessor directives according to specifically the C90 standard.
In C90, a group-part is defined as one of the following (with a question mark indicating that an element is optional):
pp-tokens?, new-line
if-section
control-line
Note that non-directives and text-lines are not included, like they are in the latest standard, C23.
pp-tokens is defined as one or more preprocessing tokens, which include operators and punctuators, which both include '#'.
Similarly, if-section and control-line both also start with a '#' character.
So if a line like # hi is encountered, which can't be interpreted as an if-section or control-line, could a C90 conforming preprocessor then instead interpret it as, for example:
punctuator/operator '#', identifier "hi"
in the same way it would interpret the lines + hi or ; hi, or even @ hi, as:
operator '+', identifier "hi"
punctuator ';', identifier "hi"
other '@', identifier "hi"
not treating it like a preprocessor directive, and instead including it in the final preprocessor output, even if it appears to be a preprocessing directive?
The context is I'm trying to make my own C90 preprocessor, and want to better understand how unrecognised preprocessing directives are required to be handled according to the standard.
In practice, this isn't what gcc (15.1.0) or clang (19.1.7) do. With -std=c90 -pedantic -E, they both give an "invalid preprocessing directive" error, which makes sense. But if they or another compiler decided to just include the unrecognised directive in the final output like I showed above, would that be okay (or undefined behaviour), or is this explicitly/implicitly disallowed somewhere in the standard and the implementation should give a warning/error message and/or stop preprocessing?
Here's an example program:
int main(void) {
# hi
return 0;
}
gcc output:
# 0 "test.c"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "test.c"
int main(void)
{test.c:3:11: error: invalid preprocessing directive #hi
3 | # hi
| ^~
return 0;
}
Note that although it appears to continue preprocessing despite the error, if I try specifying an output file with -o, it doesn't create the file and only prints the error.
And another example, this time using @:
int main(void) {
@ hi
return 0;
}
gcc output:
# 0 "test.c"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "test.c"
int main(void)
{
@ hi
return 0;
}
This question is similar, but the only answers it has received are for either C++ or versions of C beyond C90 (for example, one answer mentions non-directives, but these don't seem to exist in the C90 draft I'm referencing).
From https://port70.net/~nsz/c/c89/c89-draft.html :
If a ``shall'' or ``shall not'' requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this Standard by the words ``undefined behavior'' or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe ``behavior that is undefined.''
I would argue that C89 does not explicitly specify the behavior, so the behavior is undefined.
This was clarified with defect report 448 https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_448.htm in newer standard versions.
How does C90 treat non-directives / unrecognised preprocessing directives?
It doesn't.
would that be okay (or undefined behaviour),
Yes. Undefined behavior is not defined, so anything is ok.
is this explicitly/implicitly disallowed somewhere in the standard and the implementation should give a warning/error message and/or stop preprocessing?
No.
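Since the standard leaves the choice to the implementation, one reasonable policy for a hand-written preprocessor is to detect the case explicitly and then decide what to do with it (diagnose it, as gcc and clang do, or pass the tokens through). Below is a minimal sketch of that detection step, written in C90 style. The names line_kind, classify_line and skip_blanks are hypothetical and not taken from any real preprocessor. It assumes the input is a single logical line on which trigraph replacement, line splicing and comment removal have already been done, and that the line is not inside a skipped conditional group.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of one logical source line. */
enum line_kind {
    LINE_TEXT,              /* first token is not '#': ordinary tokens     */
    LINE_NULL_DIRECTIVE,    /* '#' followed only by white space            */
    LINE_KNOWN_DIRECTIVE,   /* '#' followed by a directive name C90 defines */
    LINE_UNKNOWN_DIRECTIVE  /* '#' followed by anything else, e.g. "# hi"  */
};

/* The directive names defined by C90. */
static const char *const c90_directives[] = {
    "if", "ifdef", "ifndef", "elif", "else", "endif",
    "include", "define", "undef", "line", "error", "pragma"
};

/* Skip spaces and horizontal tabs. */
static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

enum line_kind classify_line(const char *line)
{
    const char *p, *end;
    size_t i, len;

    assert(line != NULL);

    /* A directive is a line whose first token is '#'. */
    p = skip_blanks(line);
    if (*p != '#')
        return LINE_TEXT;

    /* "#" followed only by white space is the null directive. */
    p = skip_blanks(p + 1);
    if (*p == '\0' || *p == '\n')
        return LINE_NULL_DIRECTIVE;

    /* Measure the identifier-like token after '#' (may be empty). */
    end = p;
    while (isalnum((unsigned char)*end) || *end == '_')
        end++;
    len = (size_t)(end - p);

    for (i = 0; i < sizeof c90_directives / sizeof c90_directives[0]; i++) {
        if (strlen(c90_directives[i]) == len &&
            strncmp(p, c90_directives[i], len) == 0)
            return LINE_KNOWN_DIRECTIVE;
    }
    return LINE_UNKNOWN_DIRECTIVE;
}
```

With this sketch, classify_line("# hi") yields LINE_UNKNOWN_DIRECTIVE while classify_line("@ hi") yields LINE_TEXT, so the caller has a single place to implement whichever behaviour it prefers for the undefined case.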