How is the conditional operator parsed?

So, the cppreference claims:

The expression in the middle of the conditional operator (between ? and :) is parsed as if parenthesized: its precedence relative to ?: is ignored.

However, it appears to me that the part of the expression after the ':' operator is also parsed as if it were between parentheses. I've tried to implement the ternary operator in my programming language (and you can see the results of parsing expressions here, and my parser pretends that the part of the expression after ':' is also parenthesized. For example, for the expression (1?1:0?2:0)-1, the interpreter for my programming language outputs 0, and this appears to be compatible with C. For instance, the C program:

#include <stdio.h>

int main() {
    printf("%d\n",(1?1:0?2:0)-1);
}

Outputs 0.

Had I programmed the parser of my programming language that, when parsing the ternary operators, simply take the first already parsed node after ':' and take it as the third operand to '?:', it would output the same as ((1?1:0)?2:0)-1, that is 1.

My question is whether this would (pretending that the expression after the ':' is parenthesized) always be compatible with C?

Solution

"Pretends that it is parenthesised" is some kind of description of operator parenthesis. But of course that has to be interpreted relative to precedence relations (including associativity). So in a-b*c and a*b-c, the subtraction effectively acts as though its arguments are parenthesised, only the left-hand argument is treated that way in a-b-c and it is the comparison operator which causes grouping in a<b-c and a-b<c.

I'm sure you know all that since your parser seems to work for all these cases, but I say that because the ternary operator is right-associative and of lower precedence than any other operator [Note 1]. That means that the pseudo-parentheses imposed by operator precedence surround the right-hand argument (regardless of its dominating operator, since all operators have higher precedence), and also the left-hand argument unless its dominating operator is another conditional operator. But that wouldn't be the case in C, where the comma operator has lower precedence and would not be enclosed by the imaginary parentheses following the :.

It's important to understand what is meant by the precedence of a complex operator. In effect, to compute the precedence relations we first collapse the operator to a simple ?: which includes the enclosed (second) argument. This is not "as if the expression were parenthesized", because it is parenthesized. It is parenthesized between ? and :, which in this context are syntactically parenthetic.

In this sense, it is very similar to the usual analysis of the subscript operator as a postfix operator, although the brackets of the subscript operator enclose a second argument. The precedence of the subscript operator is logically what would result from considering it to be a single [], abstracting away the expression contained inside. This is also the same as the function call operator. That happens to be written with parentheses, but the precise symbols are not important: it is possible to imagine an alternative language in which function calls are written with different symbols, perhaps { and }. That wouldn't affect the grammar at all.

It might seem odd to think of ? and : to be "parenthetic", since they don't look parenthetic. But a parser doesn't see the shapes of the symbols. It is satisfied by being told that a ( is closed by a ) and, in this case, that a ? is closed by a :. [Note 2]

Having said all that, I tried your compiler on the conditional expression

d = 0 ? 0 : n / d

It parses this expression correctly, but the compiled code computes n / d before verifying whether d = 0 is true. That's not the way the conditional operator should work; in this case, it will lead to an unexpected divide by 0 exception. The conditional operator must first evaluate its left-hand argument, and then evaluate exactly one of the other two expressions.

Notes:

In C, this is not quite correct. The comma operator has lower precedence, and there is a more complex interaction with assignment operators, which logically have the same precedence and are also right-associative.
In C-like languages those symbols are not used for any other purpose, so it's OK to just regard them as strange-looking parentheses and leave it at that. But as the case of the function-call operator shows (or, for that matter, the unary - operator), it is sometimes possible to reuse operator symbols for more than one purpose.

As a curiosity, it is not strictly necessary that open and close parentheses be different symbols, as long as they are not used for any other purpose. So, for example, if | is not used as an operator symbol (as it is in C), then you could use | a | to mean the absolute value of a without creating any ambiguities.

A precise analysis of the circumstances in which symbol reuse leads to actual ambiguities is beyond the scope of this answer.