ansi-sqlsql

Is it really necessary to have GROUP BY in the SQL standard


After writing SQL for a few years, I find it often annoying having to put the columns I am interested in SELECT and then specify them again in GROUP BY. I can't help thinking, why do we have to do that?

What is the reason that user has to be specific about which column to group by? Can't we just let the SQL engine assume if there is an aggregate function in SELECT, group by the rest non-aggregate columns?

This will be especially helpful and more concise when you have a large CASE WHEN in SELECT.


Solution

  • Because they may not always match exactly.

    For example, If I want find out the maximum number of books per category, I could do:

    select max(cnt)
    from (
        select count(*) as cnt
        from books
        group by category
        ) t;
    

    In some DBs such as Oracle, you can even do this:

    select max(count(*))
    from books
    group by category;
    

    I don't really need to specify the category column as I don't need it.

    A few databases such as Postgres support the use of aliases in the group by clause.