What is the best way to create columns that count the number of occurrences of each value in a table? The result needs to be grouped by one column.
My database is PostgreSQL.
I have seen:
SELECT
    sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO,
    sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE,
    sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO,
    category
FROM reviews
GROUP BY category
Here, question1 can have a value of 0, 1, or 2.
I have also seen a version of that using count(CASE WHEN question1 = 0 THEN 1 END).
However, this becomes more cumbersome to write as the number of possible values for question1 increases. Is there a more convenient way to write this query, ideally one that also performs well?
In Postgres 9.4 or later, use the aggregate FILTER clause. It is typically the cleanest and fastest option:
SELECT category
     , count(*) FILTER (WHERE question1 = 0) AS zero
     , count(*) FILTER (WHERE question1 = 1) AS one
     , count(*) FILTER (WHERE question1 = 2) AS two
FROM   reviews
GROUP  BY 1;
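To see what the FILTER query returns, here is a minimal sketch against made-up data. The column types and the sample rows are assumptions for illustration only; the question does not show the real table definition.

```sql
-- Hypothetical stand-in for the real table; types and rows are assumed.
CREATE TEMP TABLE reviews (category text, question1 int);

INSERT INTO reviews (category, question1) VALUES
  ('a', 0), ('a', 0), ('a', 1)
, ('b', 2), ('b', 1), ('b', 2), ('b', NULL);

SELECT category
     , count(*) FILTER (WHERE question1 = 0) AS zero
     , count(*) FILTER (WHERE question1 = 1) AS one
     , count(*) FILTER (WHERE question1 = 2) AS two
FROM   reviews
GROUP  BY 1
ORDER  BY 1;

--  category | zero | one | two
-- ----------+------+-----+-----
--  a        |    2 |   1 |   0
--  b        |    0 |   1 |   2
```

The row with a NULL answer is not counted in any column, because no FILTER condition evaluates to true for it.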
The FILTER clause is described in the PostgreSQL manual under aggregate expressions.
If you want it short:
SELECT category
     , count(question1 = 0 OR NULL) AS zero
     , count(question1 = 1 OR NULL) AS one
     , count(question1 = 2 OR NULL) AS two
FROM   reviews
GROUP  BY 1;
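The short form relies on SQL's three-valued logic: count(expression) only counts rows where the expression is not NULL, and "OR NULL" turns every false into NULL while leaving true untouched. A quick way to check this yourself:

```sql
-- count() skips NULL inputs, so only the "true" case is counted.
SELECT (true  OR NULL)          AS true_or_null    -- true
     , (false OR NULL)          AS false_or_null   -- NULL
     , (NULL::boolean OR NULL)  AS null_or_null;   -- NULL
```

So a row contributes to count(question1 = 0 OR NULL) only when question1 = 0 is true.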
More syntax variants:
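As a sketch, these are a few other spellings I would expect to give the same counts on the reviews table from the question. The output column names are made up for illustration.

```sql
-- Alternative ways to count rows where question1 = 0, per category.
SELECT category
       -- CASE without ELSE yields NULL, which count() skips
     , count(CASE WHEN question1 = 0 THEN 1 END)       AS zero_case
       -- the form from the question
     , sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END)  AS zero_sum
       -- NULLIF turns false into NULL, which count() skips
     , count(NULLIF(question1 = 0, false))             AS zero_nullif
       -- boolean cast to 0/1; returns NULL (not 0) if question1
       -- is NULL in every row of the group
     , sum((question1 = 0)::int)                       AS zero_cast
FROM   reviews
GROUP  BY 1;
```

The count() variants always return 0 for a group with no matches, which is usually what you want in a report.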
For long lists of options, crosstab() (provided by the additional module tablefunc) is shorter and typically yields the best performance:
SELECT * FROM crosstab(
     'SELECT category, question1, count(*) AS ct
      FROM   reviews
      GROUP  BY 1, 2
      ORDER  BY 1, 2'
   , 'VALUES (0), (1), (2)'
   ) AS ct (category text, zero int, one int, two int);
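Two practical points about the crosstab() query: the function is not part of core Postgres but comes with the additional module tablefunc, which has to be installed once per database, and a category with no rows for a given value gets NULL in that column rather than 0. A minimal sketch that handles both, assuming you have the privileges to create extensions:

```sql
-- One-time setup per database; requires sufficient privileges.
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- COALESCE replaces the NULL of a missing combination with 0.
SELECT category
     , COALESCE(zero, 0) AS zero
     , COALESCE(one,  0) AS one
     , COALESCE(two,  0) AS two
FROM   crosstab(
     'SELECT category, question1, count(*) AS ct
      FROM   reviews
      GROUP  BY 1, 2
      ORDER  BY 1, 2'
   , 'VALUES (0), (1), (2)'
   ) AS ct (category text, zero int, one int, two int);
```

If NULL for "no such answers" is acceptable in your report, the plain SELECT * form is enough.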
For a detailed explanation of crosstab(), see the tablefunc chapter of the PostgreSQL manual.