I'm trying to find size of data in the table year wise.
I have a large table around 150 columns and billions of records, and also having date
column in it based on year I want to show the size of data in the GB
.
I found we can get table size using SELECT pg_size_pretty( pg_total_relation_size('tablename') );
but didn't understand how to get actual data size based on condition.
Expected Output:
year | size(GB)
--------------------
2010 10
2015 40
2020 80
More or less as per the comment:
demo at db<>fiddle
select date_part('year', created_at)-date_part('year', created_at)::int % 5
as year
,pg_size_pretty(sum(pg_column_size(test)))
from test
group by 1
order by 1;
year | pg_size_pretty |
---|---|
1985 | 100 kB |
1990 | 5396 kB |
1995 | 8598 kB |
2000 | 12 MB |
2005 | 16 MB |
2010 | 18 MB |
2015 | 21 MB |
2020 | 24 MB |
The size includes the date
-type created_at
column - I'm feeding entire rows of test
into pg_column_size()
without unpacking the fields, same way you can select x from x
- that's the table name used as a single selected thing. Group by date_trunc()
or date_part()
/extract()
, subtracting modulo 5 (rounding down to 0 or 5 in units) and sum()
that.
To only measure the payload and not the date
s, you will need to sum pg_column_size()
of each column with a bunch of additions +
(there's no variadic sum()
function out of the box).
Not date_bin()
after all, because it's limited to less than a month per bucket:
The stride interval must be greater than zero and cannot contain units of month or larger.
The demo shows an example of how to dynamically generate the uglier query that excludes date
s from the sum of sizes, based on information_schema.columns
.
The result won't include how much space the values take up in the indexes, only how much they occupy in the table and toast, post compression:
If applied directly to a table column value, this reflects any compression that was done.
If the table is partitioned by the date column, you can collect pg_total_relation_size()
for all partitions, union that and aggregate over it, which would also tell you how much space their indexes take.