snowflake-cloud-data-platform

How can I group by a case column that uses a JSON object?


I've tried the following

select payload:col1 as col1
       , case
            when contains(payload:col2, 'aaa') then 'type1'
            else 'other'
         end as type
       , count(*) as c
  from table1
  group by col1, type
  ;

where payload is a JSON/object column.

This results in a SQL compilation error: error line 3 at position 26 'table1.PAYLOAD' in select clause is neither an aggregate nor in the group by clause.

As far I understand I should be able to group by the case column type and payload is not really in any output column (there is case that uses payload but payload itself is not)


Solution

  • In the SELECT list of a GROUP BY query you can reference "anything" inside an aggregation expression, but only expressions from the GROUP BY list outside an aggregation expression.

    In your case, I believe (untested) that you should not have gotten an error if the GROUP BY contained payload:col2 or contains(payload:col2, 'aaa') or the whole "type" column expression. Only the last two would give a correct result, however...

    The most compact GROUP BY expression that will work I think will be:

    GROUP BY col1, type, contains(payload:col2, 'aaa')
    

    With that your SELECT list expressions only references GROUP BY expressions from the table expression of the query.

    EDIT: Too low reputation to comment );

    @ecerulm No, I meant the whole "type" column expression, ie:

    GROUP BY col1, type,
         case
            when contains(payload:col2, 'aaa') then 'type1'
            else 'other'
         end
    

    In fact it is explained exactly how this should work in the SQL standard, so the error message you receive is according to standard.