I believe I've come across an ambiguous case in the C standard regarding the scope of objects declared in the first clause of a for loop compared to objects declared in the loop body.
Given the following code:
    for (int i = 1; i < 5; i++) {
        int i = 2;
        printf("%d ", i);
    }
At first glance, it seems that the declaration of i inside of the loop body is at an inner scope compared to the i declared in the first clause of the for loop.
Section 6.2.1p4 of the C23 standard states the following regarding block scope identifiers:
If the declarator or type specifier that declares the identifier appears inside a block or within the list of parameter declarations in a function definition, the identifier has block scope, which terminates at the end of the associated block
And 6.2.2p6 regarding linkage:
The following identifiers have no linkage: an identifier declared to be anything other than an object or a function; an identifier declared to be a function parameter; a block scope identifier for an object declared without the storage-class specifier extern.
So both have block scope and no linkage.
However, looking at 6.2.1p6 which says the following regarding comparing scopes:
Two identifiers have the same scope if and only if their scopes terminate at the same point.
And 6.8.5.3p1 regarding the for statement says the following regarding the scope of identifiers it declares:
If clause-1 is a declaration, the scope of any identifiers it declares is the remainder of the declaration and the entire loop, including the other two expressions; it is reached in the order of execution before the first evaluation of the controlling expression
Putting these together, it appears that both objects declared as i have scopes that end at the same point, namely the closing brace of the for loop (i.e. the end of "the entire loop"), and therefore have the same scope, and additionally both have no linkage.
This would then be a constraint violation as per 6.7p3:
If an identifier has no linkage, there shall be no more than one declaration of the identifier (in a declarator or type specifier) with the same scope and in the same name space.
But no compiler I've tested with displays any diagnostic.
Intuitively, it seems this should be allowed.
Is this a defect in the standard, or am I missing something? Is there a specific reason that scope matching only checks the end but not the beginning? The language is the same going back to C99 when declarations in for loops were introduced.
NOTE: this is a major rewrite of the initial version of this answer.
At first glance, it seems that the declaration of i inside of the loop body is at an inner scope compared to the i declared in the first clause of the for loop.
And it is.
[...] So both have block scope and no linkage
You have quoted the relevant sections of the spec. I don't think they leave any room for doubt that both i's have block scope and no linkage.
Putting these together, it appears that both objects declared as i have scopes that end at the same point, namely the closing brace of the for loop (i.e. the end of "the entire loop"), and therefore have the same scope
Sort of. It is important to recognize that in the language of the spec, "block" does not mean only a brace-enclosed compound statement. Per C23 6.8.1p3,
A block is either a primary block, a secondary block, or the block associated with a function definition; it allows a set of declarations and statements to be grouped into one syntactic unit.
Primary blocks include not only compound statements but also selection statements (if and switch) and iteration statements (for, while, do ... while). Secondary blocks are the substatements of selection and iteration statements, and any labeled or unlabeled statement can serve as one -- not only the above, but also expression statements and jump statements (break, continue, goto, return). (6.8.1p1) Overall, "block" is a pretty broad syntactic category.
6.8.5.3p1 does say that the scope of the outer i includes "the entire loop", which is reasonably interpreted as "the entire for statement". That understanding is helpful when we consider the next bit of 6.8.1p3:
Whenever a block B appears in the syntax production as part of the definition of an enclosing block A, scopes of identifiers and lifetimes of objects that are associated with B do not extend to the parts of A that are outside of B.
That speaks directly to the question. In the example code, the overall for statement has the role of A, and the compound statement serving as loop body has the role of B. The scope of the inner i ends at the end of B, the loop body, whereas the scope of the outer i continues past that to the end of A, the overall for statement. The tricky point here is that this is primarily a semantic distinction in the example case, not a syntactic one.
And it is the semantic distinction that is most important. In particular, the fact that the outer i retains its value across loop iterations reflects that it does not go out of scope for the whole execution of the loop, whereas the fact that the inner i does not retain its value, or even live, from one iteration to the next reflects its narrower scope.
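That behavior can be observed with a small variation on the question's loop. This is only a sketch: the function run_shadowed_loop and its parameters are invented for illustration, and the extra i += 10 is there just to show that changes to the inner i do not survive into the next iteration.

```c
/* Hypothetical helper: runs the loop from the question, records the value
 * of the inner i seen in each iteration, and returns the iteration count. */
int run_shadowed_loop(int seen[], int capacity)
{
    int count = 0;
    for (int i = 1; i < 5; i++) {   /* outer i: keeps its value across iterations */
        int i = 2;                  /* inner i: a fresh object on every iteration */
        i += 10;                    /* modifies only the inner i */
        if (count < capacity)
            seen[count] = i;        /* records 12 every time */
        count++;
    }                               /* i < 5 and i++ above refer to the outer i */
    return count;
}
```

The loop runs four times because the controlling expression and i++ operate on the outer i, which counts 1, 2, 3, 4. Every recorded value is 12, since the inner i is initialized to 2 again at the start of each iteration.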
If you want to overlay a syntactic distinction on that, however, then one way to do so would be to interpret the scope of the inner i to exclude the closing brace of the loop body, and the scope of the outer i to include it. That gives you a difference in scope that can be expressed in terms of distinguishable regions of source code.
Consider also this code fragment, which has similar structure, but is less controversial:
    {
        int i = 1;
        {
            int i = 2;
            printf("%d ", i);
        }
    }
Note that there are no statements between the end of the scope of the inner i and the end of the scope of the outer i, yet they are still different scopes. This supports the idea that scope is about the logical structure of the program, not necessarily measurable in terms of execution steps.
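One way to make the two scopes visibly different is to add a statement between the two closing braces, which the fragment above deliberately lacks. The following sketch does that. The function nested_scopes and its variables inner_value and outer_after are hypothetical names introduced only for this illustration.

```c
/* Hypothetical helper: returns inner_value * 10 + outer_after so that both
 * observations can be checked from a single return value. */
int nested_scopes(void)
{
    int inner_value;
    int outer_after;
    {
        int i = 1;
        {
            int i = 2;          /* hides the outer i until the inner closing brace */
            inner_value = i;    /* reads the inner i: 2 */
        }
        outer_after = i;        /* the outer i is visible again: 1 */
    }
    return inner_value * 10 + outer_after;  /* 2 * 10 + 1 = 21 */
}
```

Removing the outer_after assignment gives back the original structure. The scopes still end at different braces, only without anything left in the code to observe the difference.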
Is this a defect in the standard, or am I missing something?
I suspect that you are missing the same thing that I initially did, that the term "block" must be understood pretty expansively. You may also be thinking in terms of the lexical structure of the code, whereas in C, scope must be understood at a grammatical / semantic level.
Is there a specific reason that scope matching only checks the end but not the beginning?
The C99 rationale spends some time on scope, but it does not address that particular question. Although the language defining "same scope" was new in C99, the general concept was carried over from C90, where the issue you ask about does not arise. C90 focused on where scope ends, not where it begins.
The C99 wording has been carried forward substantially unchanged into C23. I have always thought that it was a bit fraught, but I am reasonably confident that even from before C90, the idea was to be inclusive of all identifiers, including those (with internal or external linkage) that are declared multiple times in the same scope. Some declarations start the scope of the declared identifier whereas others do not, but the point where the scope of an identifier ends is clear. And that endpoint captures the desired idea.