sqlgoogle-bigqueryoperatorsgrammararity

Arity of BETWEEN expression


What is the arity of the sql BETWEEN expression? I thought it was three (ternary) since the expression usually looks like:

WHERE...
   1 BETWEEN 2 AND 3

But it's listed as binary on BigQuery's documentation, and I assume other places as well.

enter image description here

Source: Operators.

What is the arity of the BETWEEN expression and why? I think the answer is 3 from the following example:

select 
  ~ (SELECT -1 AS expr_1)                              AS 'bitwise_arity_1',
    (SELECT 1 AS expr_1) * (SELECT 2 AS expr_2)        AS 'times_arity_2',
    (SELECT 1 AS expr_1) BETWEEN 
      (SELECT 2 AS expr_2) AND (SELECT 3 AS expr_3)    AS 'bitwise_arity_3?'

I suppose one way to interpret it might just be that the grammar is:

expr 'BETWEEN' logicalAndExpr

And so the two expressions in the logicalAnd are just grouped into one. Is that a correct understanding?

SQLFiddle: http://www.sqlfiddle.com/#!9/b28da2/2156


Solution

  • It's binary, in syntactic terms. See below for a discussion of syntax vs. semantics, where I note that a better syntactic term is "infix".

    Similarly, function calls and array subscripting are postfix unary operators and the C family's conditional operator (often misnamed "the ternary operator" as though it were the only such thing) is also infix. The reason is that the interior operands (the operands between BETWEEN...AND, (...), [...], and ?...:, respectively) are fenced off from the rest of the syntax by the pair of surrounding terminal tokens which function as a syntactic barrier, like parentheses. Precedence does not penetrate to the enclosed operands; only the outer operand(s) remain floating in the syntax.

    The semantic view is quite different, of course. BETWEEN...AND and ?...: are certainly three-argument functions, although since the latter is short-circuiting, only two of the three arguments are ever evaluated, which makes it hard to discuss in strict mathematical terms [Note 1]. Moreover, the semantic view is complicated by the fact that there is not just a single way to look at what an argument is. As noted in a comment, you can always curry functions into a series of unary applications of higher-order functions. Although you might be tempted to try to redefine "arity" as the length of that sequence, you will soon find higher-order functions which have different sequence lengths depending on the values of their arguments. Also, in most programming languages (unlike SQL) the function being called is a full expression which does not need to be evaluated at compile-time, and since different functions have different argument counts, there is no good way to describe the arity of a function call unless you respecify the call to be the application of a list-of-arguments object to a callable object. That's often done, but it's a bit unsatisfying because (in most languages), the list object does not really exist and cannot be observed as an object.

    I'd suggest taking the Wikipedia article on arity with a good-sized saline dosage, because it completely misses the distinction between semantics and syntactic structure, giving rise to the confusing ambiguity between the semantic and syntactic view of SQL's range operator or C's conditional operator. Personally, I prefer to reserve "arity" for the semantic meaning, using "fixity" or "valence" for the syntactic feature. (The advantage of "fixity" is that it encourages the distinction between prefix and postfix, which is a real distinction hidden by calling both cases "unary operators".)

    Notes

    1. BETWEEN...AND could short-circuit, too, but standard SQL doesn't guarantee short-circuiting, as far as I know (although some SQL implementations do.)