Tags: perl, expression, operator-precedence, associativity

Is the comma in Perl associative in all contexts?


Suppose E, F, and G are expressions that don't involve operators of higher precedence than the comma. Are the expressions ((E, F), G) and (E, (F, G)) equivalent in scalar, list, and/or void contexts? More precisely, can you always replace ((E, F), G) with (E, (F, G)) and vice-versa without affecting program execution?


Solution

  • Context does not affect how code is parsed.

    The comma/list operator is guaranteed to evaluate each operand from left to right (regardless of context).

    From that and a couple more pieces of information, we can prove that "E, F, G", "( E, F ), G" and "E, ( F, G )" are equivalent.[1]
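
As a rough check of the left-to-right claim, the sketch below logs the order in which three operands run under each grouping. The subs E, F and G and the @log array are hypothetical stand-ins for the question's expressions, not anything from the original post.

```perl
use strict;
use warnings;

my @log;

# Hypothetical operands: each records its name when evaluated.
sub E { push @log, 'E'; return 1 }
sub F { push @log, 'F'; return 2 }
sub G { push @log, 'G'; return 3 }

# Evaluate each grouping in list context, then drain the log
# (splice with no offset removes and returns every element).
my @flat        = ( E(), F(), G() );
my @order_flat  = splice @log;

my @left        = ( ( E(), F() ), G() );
my @order_left  = splice @log;

my @right       = ( E(), ( F(), G() ) );
my @order_right = splice @log;

print "@order_flat | @order_left | @order_right\n";  # E F G | E F G | E F G
print "@flat | @left | @right\n";                    # 1 2 3 | 1 2 3 | 1 2 3
```

This only exercises list context with side-effect-free return values; it illustrates the claim rather than proving it.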


    Since you specifically mentioned context, we'll look at that in more detail.

    For void, list or indeterminate[2] context c, we get the same context c for each item in all cases.

       c               c                 c
    -------       -----------       -----------
    c  c  c          c      c       c     c
    -  -  -       --------  -       -  --------
                    c  c                 c  c
                    -  -                 -  -
    E, F, G       ( E, F ), G       E, ( F, G )
    

    For scalar context (s), we get void context (v) for all but the last item in all cases.

       s               s                 s
    -------       -----------       -----------
    v  v  s          v      s       v     s
    -  -  -       --------  -       -  --------
                    v  v                 v  s
                    -  -                 -  -
    E, F, G       ( E, F ), G       E, ( F, G )
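
The context diagrams can be spot-checked with wantarray. In this sketch, ctx is a hypothetical helper (not from the original post) that records the context its call site imposes; the assignments fix the context at compile time.

```perl
use strict;
use warnings;

my @seen;

# Hypothetical helper: log the context this call was made in.
sub ctx {
    my ($name) = @_;
    my $w = wantarray;    # true: list, defined but false: scalar, undef: void
    push @seen, $name . ':' . ( $w ? 'list' : defined($w) ? 'scalar' : 'void' );
    return $name;
}

# Scalar context, known at compile time.
my $s1 = ( ctx('E'), ctx('F'), ctx('G') );
my $t1 = join ' ', splice @seen;
my $s2 = ( ( ctx('E'), ctx('F') ), ctx('G') );
my $t2 = join ' ', splice @seen;
my $s3 = ( ctx('E'), ( ctx('F'), ctx('G') ) );
my $t3 = join ' ', splice @seen;

# List context.
my @l1 = ( ( ctx('E'), ctx('F') ), ctx('G') );
my $t4 = join ' ', splice @seen;
my @l2 = ( ctx('E'), ( ctx('F'), ctx('G') ) );
my $t5 = join ' ', splice @seen;

print "$_\n" for $t1, $t2, $t3;   # each: E:void F:void G:scalar
print "$_\n" for $t4, $t5;        # each: E:list F:list G:list
print "$s1 $s2 $s3\n";            # G G G
```

Sub calls are used as operands because they can observe their context and do not trigger "useless use" warnings when evaluated in void context.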
    

    1. I'd love to be able to say they compile identically, but they don't. Despite the documentation saying it's a binary operator, it's implemented as an n-ary operator. The parens cause another instance of the operator to be created (effectively making it a binary operator in your examples).

    2. When it's the last expression of a sub, the context is only known at run-time, which I called "indeterminate". This makes no difference except for the list/comma operator in scalar context. The run-time context is propagated to every operand, so you get s,s,s if it's only known at run-time, despite getting v,v,s if the context is known at compile-time.

      As you can see above, this doesn't affect the answer. "E, F, G", "( E, F ), G" and "E, ( F, G )" are equivalent whether the context is known at compile-time or not.
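
The run-time case from note 2 can be sketched the same way. Here the comma expression is the last expression of a sub, so its context comes from the caller. The names ctx, flat, left and right are hypothetical, and the expected traces follow the s,s,s behavior described in note 2.

```perl
use strict;
use warnings;

my @seen;

# Hypothetical helper: log the context this call was made in.
sub ctx {
    my ($name) = @_;
    my $w = wantarray;
    push @seen, $name . ':' . ( $w ? 'list' : defined($w) ? 'scalar' : 'void' );
    return $name;
}

# The comma expression is the sub's last expression, so its context
# is only known when the sub is called.
sub flat  { ctx('E'), ctx('F'), ctx('G') }
sub left  { ( ctx('E'), ctx('F') ), ctx('G') }
sub right { ctx('E'), ( ctx('F'), ctx('G') ) }

my ( @values, @traces );
for my $f ( \&flat, \&left, \&right ) {
    my $v = $f->();                       # caller imposes scalar context
    push @values, $v;
    push @traces, join ' ', splice @seen;
}

print "$_\n" for @traces;   # each: E:scalar F:scalar G:scalar
print "@values\n";          # G G G
```

The operands see scalar rather than void context here, but since all three groupings see the same thing and return the same value, the equivalence still holds.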