clanguage-lawyerc99

Is recursively calling main from its own parameters (abusing sizeof with VLAs) standard C99?


Disclaimer: my question isn't practical at all, it's more a question about 2 codes which allegedly abuse the rules and somewhat compile (with more or less warnings / errors according to the compiler).

As far as I know, the following code is valid C99 :

#include <stdio.h>
#include <stddef.h>

int main(void) {
    size_t n = sizeof(int[printf("%s", "Hello")]);
}

I understand that sizeof doesn't evaluate an expression unless it is a VLA (because it has non-constant size).

However, I'm not sure if this other piece of code is valid C99 :

#include <stdio.h>

int main(int argc, char** argv);
int main(int argc, char* argv[sizeof(int[printf("%s\n", "Hello") + main(0, NULL)])]) {}

Run it on Compiler Explorer

I guess making an array of size sizeof(VLA) is valid, I know no rule which would make this invalid.
If I remember correctly, char*[some size] as function parameter decays to char**, but I don't know if it's ill-formed because the declaration/definition don't perfectly match.
And I have no idea if recursively call main from its parameters is valid or not, I obviously didn't find any resource or example of a code like this. Also, does main have different rules than other functions about recursive calls in the parameters ?
I guess the compiler may not optimize it out (I guess it's not an UB), so I also think a stack overflow will always happen.

Quotes from the standard would be very appreciated, thanks a lot.


Solution

    • Is it valid to have an array of size sizeof(some VLA) ?

    Yes.

    The primary relevant provisions of C99 were:


    [...] the [ and ] may delimit an expression or *. If they delimit an expression (which specifies the size of an array), the expression shall have an integer type. If the expression is a constant expression, it shall have a value greater than zero. [...]

    An ordinary identifier (as defined in 6.2.3) that has a variably modified type shall have either block scope and no linkage or function prototype scope. If an identifier is declared to be an object with static storage duration, it shall not have a variable length array type.

    (C99 6.7.5.2/1-2)


    Note that the only constraint on the makeup of the expression designating the array size, if present, is that it have integer type. The usage context places no other requirements on the operators or operands involved.


    If the size is an integer constant expression and the element type has a known constant size, the array type is not a variable length array type; otherwise, the array type is a variable length array type.

    If the size is an expression that is not an integer constant expression: if it occurs in a declaration at function prototype scope, it is treated as if it were replaced by *; otherwise, each time it is evaluated it shall have a value greater than zero. The size of each instance of a variable length array type does not change during its lifetime. Where a size expression is part of the operand of a sizeof operator and changing the value of the size expression would not affect the result of the operator, it is unspecified whether or not the size expression is evaluated.

    (C99 6.7.5.2/4-5)


    Note that the last sentence of that does not apply to your particular case; I include it for completeness, since it does involve mixing sizeof and VLAs.

    The only additional requirement here is that the value be greater than 0, but that is always true of sizeof expressions (there are no zero-size objects).

    • Is it valid to have a declaration like int main(int, char**) and a definition like int main(int, char*[some size]) ?

    Yes.

    The primary relevant provisions of C99 were:


    All declarations in the same scope that refer to the same object or function shall specify compatible types.

    (C99 6.7/4)


    Note that multiple declarations of the same function are contemplated.


    For two function types to be compatible, both shall specify compatible return types. Moreover, the parameter type lists, if both are present, shall agree in the number of parameters and in use of the ellipsis terminator; corresponding parameters shall have compatible types. [...] (In the determination of type compatibility and of a composite type, each parameter declared with function or array type is taken as having the adjusted type and each parameter declared with qualified type is taken as having the unqualified version of its declared type.)

    (C99 6.7.5.3/15)


    Note in particular that the comparison of parameter types uses the adjusted type. This refers to


    A declaration of a parameter as ''array of type'' shall be adjusted to ''qualified pointer to type'', where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation.

    (C99 6.7.5.3/7)


    That adjustment moots the difference between your char** and char*[some size] where they appear as the types of function parameters.

    • And is it valid to recursively call a function from its own parameters ?

    It depends.

    C99 does not allow calls to undeclared functions (though many compilers accept these as a backwards-compatibility extension), and a function identifier is not in scope until immediately after its declaration is complete. Thus an expression in a function declaration cannot call that function unless there is another declaration of the same function preceding it. That usually is not the case, but your example takes care to provide for it.

    Otherwise, as already discussed, the only limitation on the form of the size expression in a VLA is that it have integer type. This is not limited by the context in which the VLA declaration appears.

    • Is it guaranteed the recursion will happen or the compiler has the right to optimize it out ?

    No.

    The primary provision here is


    The least requirements on a conforming implementation are:

    [...]

    [...]

    (C99 5.1.2.3/5)


    This is the primary provision about what optimizations may be performed overall. Supposing we interpret printf() to produce output to a file (and that is the usual interpretation), it is not allowed to optimize away printf calls that emit any output at all. However, the compiler is permitted to transform that from a recursion to a loop, for example.

    If the recursion will happen, is it guaranteed there will be a stack overflow ?

    No.

    Even if the compiler implements it as a recursion, the language spec says nothing about stack or stack overflow. This is firmly in the area of implementation detail.

    Moreover, even on an implementation that uses a call stack and therefore is conceivably susceptible to stack overflow, the compiler can avoid that in this particular case by observing that it is tail recursive, and therefore that each recursive call can reuse the same stack frame.