c initialization c99 c23

Using previously set fields when initializing a structure with compound literals


What is the expected behavior when an initializer refers to previously set fields while declaring initial values for a struct using a compound literal? Specifically, is it OK to write: struct foo v = { .v1 = ..., .v2 = .v1+1 };

More complete example:

struct s {
    int i1 ;
    int i2 ;
} ;

void func(void)
{
     // Case A:
     struct s v1 = { 3 } ;
     v1 = (struct s) { .i1 = 4, .i2 = v1.i1+10 } ;
     // always v1.i2=13, NOT 14

     // Case B:
     // Is this legal ?
     struct s v2 = { .i1 = 1, .i2 = v2.i1+10 } ;
     // Under GCC, v2.i2 = 11

}

In case A, the standard is clear that a "temporary" compound literal is created, based on the value of v1 before the assignment (v1.i1=3), resulting in v1.i2 = 13. The explanation is that this is basically:

     struct s v1 = { 3 } ;
     struct s temp = { .i1 = 4, .i2 = v1.i1+10 } ;
     v1 = temp ;
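A minimal, self-contained sketch of Case A follows. The function name case_a is hypothetical and only serves to wrap the assignment so the result can be checked; the struct is the one from the example above.

```c
#include <assert.h>

struct s {
    int i1;
    int i2;
};

/* Hypothetical helper: performs the Case A assignment and returns the result. */
struct s case_a(void)
{
    struct s v1 = { 3 };    /* v1.i1 == 3, v1.i2 == 0 */

    /* The compound literal is evaluated first, reading the old v1.i1 (3).
       Only afterwards does the assignment store the new value into v1. */
    v1 = (struct s) { .i1 = 4, .i2 = v1.i1 + 10 };

    assert(v1.i1 == 4);
    assert(v1.i2 == 13);    /* 3 + 10, not 4 + 10 */
    return v1;
}
```

Calling case_a() returns a struct with i1 == 4 and i2 == 13, matching the "temp" rewrite shown above.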

For case B, at least with GCC, it looks as if it's OK to reference previously set values in the SAME statement. In this case i2 is set to 11, implying that this was NOT implemented using a temporary variable, which would have resulted in v2.i2 = 10:

     // NOT the same as struct s v2 = { ... };
     struct s temp = { .i1 = 1, .i2 = v2.i1+10 } ;
     struct s v2 = temp ;

My question: what do the C99/C23 standards say about the second case? Is it required behavior, implementation-specific behavior, or undefined behavior that compilers should flag with a warning or error?

I tried the above with GCC 9 (-Wall, -Wextra) and also with Coverity; no issue was raised.


Solution

  • The C standard says nothing about which elements or members of aggregates are initialized before any initializers are evaluated.

    The standard contains wording about the order in which elements or members are initialized (in C 2023 draft N3096 6.7.10). However, at best, this order only tells us the order in which values are stored in the elements or members. It says nothing about when those values were prepared. 6.7.10 24 says:

    The evaluations of the initialization list expressions are indeterminately sequenced with respect to one another and thus the order in which any side effects occur is unspecified.

    That wording first appeared in C 2011 6.7.9 23 (determined from draft N1570). (The fact it was absent prior to that does not mean there was sequencing; it means the C standard was silent on it, and hence did not assert anything about sequencing.) This wording confirms the expressions are not necessarily evaluated in the same order that elements or members are initialized, because that would impose an order on them.
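The effect of indeterminate sequencing can be made visible with initializers that have side effects. The following is only a sketch: the counter, next_value, and make_pair are hypothetical names introduced for illustration, not part of the question.

```c
struct s {
    int i1;
    int i2;
};

static int counter = 0;

/* Hypothetical helper with a visible side effect: returns 1, 2, 3, ... */
static int next_value(void)
{
    return ++counter;
}

struct s make_pair(void)
{
    counter = 0;

    /* The two calls are indeterminately sequenced with respect to each
       other: either one may run first, but they do not interleave.
       The result is therefore {1, 2} or {2, 1}. */
    struct s v = { .i1 = next_value(), .i2 = next_value() };
    return v;
}
```

Either outcome is conforming, so portable code should not depend on which member receives 1.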

    Thus, in struct s v2 = { .i1 = 1, .i2 = v2.i1+10 }, the compiler is free to generate a program that operates in this order:

    1. Evaluate 1.
    2. Evaluate v2.i1+10.
    3. Initialize .i1 to result of step 1.
    4. Initialize .i2 to result of step 2.

    In this order, v2.i1 is not initialized at the time it is evaluated for step 2.

    Other permissible orders include 2 1 3 4 and 1 3 2 4, as well as reorderings separating v2.i1+10 into evaluations of v2.i1, 10, and the addition.
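One way to get the result Case B appears to aim for, without relying on any particular evaluation order, is to compute the shared value before the declaration. This is a minimal sketch under that assumption; the names first and init_portably are hypothetical.

```c
struct s {
    int i1;
    int i2;
};

struct s init_portably(void)
{
    /* Hypothetical local: holds the value both members depend on, so no
       initializer reads a member of the object being initialized. */
    const int first = 1;

    struct s v2 = { .i1 = first, .i2 = first + 10 };
    return v2;
}
```

Here neither initializer expression reads v2, so every permitted evaluation order gives i1 == 1 and i2 == 11.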

    The primary purpose of specifying initialization order is to identify which elements or members are initialized when no designations are present. That is, when parsing the code, the compiler takes the initializers to correspond to elements or members in the order they appear in the aggregate. A secondary purpose appears in C 2023 draft N3096 6.7.10 20:

    The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject;…

    This tells us which initializer is used if an element or member appears multiple times in the initialization, but again it says nothing about the order in which the initializers are evaluated.
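The overriding rule can be illustrated with a member that is designated twice. This is a sketch only; last_designator_wins is a hypothetical name, and some compilers warn about the overridden initializer (GCC does so under -Woverride-init).

```c
struct s {
    int i1;
    int i2;
};

struct s last_designator_wins(void)
{
    /* .i1 is designated twice; the later initializer (2) overrides the
       earlier one (1). Constant initializers are used here so that no
       side effects are involved. */
    struct s v = { .i1 = 1, .i2 = 20, .i1 = 2 };
    return v;
}
```

The returned struct has i1 == 2 and i2 == 20, which shows which initializer applies but nothing about when each expression was evaluated.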

    Supplementary Discussion

    I suspect the discussion of “order” in C 2023 6.7.10 was motivated solely by the need to specify which initializers apply to which elements or members and was not intended to have any effect on program state in a running program. The only way I see it might affect observable program behavior (aside from attempting to reference previously initialized elements or members) is if there are volatile objects being initialized. This would raise some problems. Is initialization an access to an object for the purpose of volatile? If an element or member appears twice in the initialization, does the fact that the second appearance “overrides” the first mean the first is suppressed and never occurs, or that it occurs but is later overwritten?