sql postgresql count group-by aggregate-filter

Conditional SQL count


What is the best way to create columns that count the number of occurrences of each value in a table? The result needs to be grouped by one column.
My database is PostgreSQL.

I have seen:

SELECT
    sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO,
    sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE,
    sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO,
    category
FROM reviews
    GROUP BY category

where question1 can have a value of either 0, 1 or 2.

I have also seen a version of that using count(CASE WHEN question1 = 0 THEN 1 END).

However, this becomes more cumbersome to write as the number of possible values for question1 increases. Is there a convenient way to write this query, possibly optimizing performance?


Solution

  • In Postgres 9.4 or later, use the aggregate FILTER option. Typically cleanest and fastest:

    SELECT category
         , count(*) FILTER (WHERE question1 = 0) AS zero
         , count(*) FILTER (WHERE question1 = 1) AS one
         , count(*) FILTER (WHERE question1 = 2) AS two
    FROM   reviews
    GROUP  BY 1;
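
To try the FILTER form on a small data set, a minimal sketch follows. The sample rows are invented for illustration, and the column types are assumptions; only the names reviews, category and question1 come from the question. A temporary table is used so nothing permanent is touched.

```sql
-- Hypothetical sample data; column types are assumed, not taken from the question.
CREATE TEMP TABLE reviews (category text, question1 int);

INSERT INTO reviews (category, question1) VALUES
  ('a', 0), ('a', 0), ('a', 2)
, ('b', 1), ('b', 2), ('b', NULL);

-- One pass over the table; each FILTER restricts the rows its aggregate sees.
SELECT category
     , count(*) FILTER (WHERE question1 = 0) AS zero
     , count(*) FILTER (WHERE question1 = 1) AS one
     , count(*) FILTER (WHERE question1 = 2) AS two
FROM   reviews
GROUP  BY category
ORDER  BY category;
```

With these rows, category a yields 2, 0, 1 and category b yields 0, 1, 1. The row with a NULL answer matches none of the filters, and a filter that matches no rows produces 0, not NULL.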
    


    If you want it short:

    SELECT category
         , count(question1 = 0 OR NULL) AS zero
         , count(question1 = 1 OR NULL) AS one
         , count(question1 = 2 OR NULL) AS two
    FROM   reviews
    GROUP  BY 1;
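
The short form relies on SQL's three-valued logic: true OR NULL is true, false OR NULL is NULL, and count(expression) only counts non-null values. A small sketch of that behavior, using inline values instead of the reviews table (the column name x is made up for the demo):

```sql
-- How "OR NULL" turns a boolean into "countable or not".
SELECT (true  OR NULL)         AS when_match     -- true: counted
     , (false OR NULL)         AS when_no_match  -- NULL: not counted
     , (NULL::boolean OR NULL) AS when_null;     -- NULL: not counted

-- Applied inside count(): only the two rows where x = 1 are counted.
SELECT count(x = 1 OR NULL) AS ones
FROM  (VALUES (1), (1), (2), (NULL)) AS t(x);    -- ones = 2
```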
    


    Proper crosstab query

    crosstab() (provided by the additional module tablefunc) typically yields the best performance and is shorter for long lists of options:

    SELECT * FROM crosstab(
         'SELECT category, question1, count(*) AS ct
          FROM   reviews
          GROUP  BY 1, 2
          ORDER  BY 1, 2'
       , 'VALUES (0), (1), (2)'
       ) AS ct (category text, zero int, one int, two int);
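
Because crosstab() comes from the tablefunc module, the extension has to be installed once per database before the query above will run. Unlike the count() variants, crosstab() also returns NULL where a category has no rows for a given value. The sketch below assumes you have the privilege to create extensions; the COALESCE wrapper is an addition for illustration, not part of the original query.

```sql
-- Install the module that provides crosstab(); needed once per database.
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Same crosstab as above, with missing combinations reported as 0 instead of NULL.
SELECT category
     , COALESCE(zero, 0) AS zero
     , COALESCE(one,  0) AS one
     , COALESCE(two,  0) AS two
FROM   crosstab(
     'SELECT category, question1, count(*) AS ct
      FROM   reviews
      GROUP  BY 1, 2
      ORDER  BY 1, 2'
   , 'VALUES (0), (1), (2)'
   ) AS ct (category text, zero int, one int, two int);
```

Adding another possible value means one more entry in the VALUES list and one more column in the output definition, which stays shorter than one aggregate expression per value.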
    
