This minimal example is supposed to extract year from time stamps, then count something in a given year.
SELECT EXTRACT(YEAR FROM rental_ts) as year,
COUNT(DISTINCT rental_id)
FROM rental
GROUP BY year
HAVING year=2020
Running it, I get an error column "year" does not exist
. What is the reason for this?
HAVING EXTRACT(YEAR FROM rental_ts)=2020
works without problems, but not very convenient.year
in WHERE
clause instead.Alas, that is true. Column aliases are not allowed. One solution is to repeat the expression:
SELECT EXTRACT(YEAR FROM rental_ts) as year,
COUNT(DISTINCT rental_id)
FROM rental
GROUP BY year
HAVING EXTRACT(YEAR FROM rental_ts) = 2020;
A better solution is to filter before aggregating:
SELECT EXTRACT(YEAR FROM rental_ts) as year,
COUNT(DISTINCT rental_id)
FROM rental
WHERE rental_ts >= '2020-01-01' AND rental_ts < '2021-01-01'
GROUP BY year;
This is better for two reasons. First, it is index (and partition) compatible. Second, it reduces the amount of data needed for the aggregation.