I have a large dataset with many different product. In a small scale it would look like this:
product | month | amount |
---|---|---|
AA | 1 | 100 |
AA | 1 | 150 |
AA | 2 | 200 |
AA | 2 | 120 |
BB | 2 | 180 |
BB | 2 | 220 |
CC | 3 | 80 |
I want get the info in a different order. Write as many new columns as distinct values from the column "product", then fill it with the sum of the amount by month. It would look like this:
month | AA | BB | CC |
---|---|---|---|
1 | 250 | NA | NA |
2 | 320 | 400 | NA |
3 | NA | NA | 80 |
The important thing is to PIVOT the table, that's the main problem I have. I saw similar questions but all have being solved with PIVOT function but I'm doing this with a Postgres database in DBeaver and it doesn't have the PIVOT function:
SELECT *
FROM (
SELECT product, month, amount
FROM ventas
) AS SourceTable
PIVOT (
SUM(amount)
FOR month IN ([1], [2], [3])
) AS PivotTable;
SQL Error [42601]: ERROR: syntax error at or near "PIVOT" Position: 95
I've tried different ways but no success. I can't write the names of all products in the query as there are too many!
Here is a test setup for your convenience:
CREATE TABLE ventas (
product VARCHAR(50),
month INT,
amount INT
);
INSERT INTO ventas (product, month, amount) VALUES
('AA', 1, 100),
('AA', 1, 150),
('AA', 2, 200),
('AA', 2, 120),
('BB', 2, 180),
('BB', 2, 220),
('CC', 3, 80),
If you actually need separate output columns, the only reasonable approach is a 2-step flow:
If you are not familiar with the crosstab()
function, read this first:
-- generate crosstab() query
SELECT format(
$f$ -- begin dynamic query string
SELECT * FROM crosstab(
$q$
SELECT month, product, sum(amount)
FROM ventas
GROUP BY 1, 2
ORDER BY 1, 2
$q$
, $c$VALUES (%s)$c$
) AS ct(month int, %s);
$f$ -- end dynamic query string
, string_agg(quote_literal(sub.product), '), (')
, string_agg(quote_ident (sub.product), ' int, ') || ' int'
)
FROM (SELECT DISTINCT product FROM ventas ORDER BY 1) sub;
This generates a query of the form:
SELECT * FROM crosstab(
$q$
SELECT month, product, sum(amount)
FROM ventas
GROUP BY 1, 2
ORDER BY 1, 2
$q$
, $c$VALUES ('AA'), ('BB'), ('CC')$c$
) AS ct(month int, "AA" int, "BB" int, "CC" int);
... which you then execute.
Closely related, with more explanation and options: