I have a Greenplum database instance running in Docker. The tables and indexes hold very little data (approx. 550 MB). I checked the size of all tables using the query below:
SELECT *
     , pg_size_pretty(total_bytes) AS total
     , pg_size_pretty(index_bytes) AS INDEX
     , pg_size_pretty(toast_bytes) AS toast
     , pg_size_pretty(table_bytes) AS TABLE
FROM (
    SELECT *, total_bytes - index_bytes - COALESCE(toast_bytes, 0) AS table_bytes
    FROM (
        SELECT c.oid
             , nspname AS table_schema
             , relname AS table_name
             , c.reltuples AS row_estimate
             , pg_total_relation_size(c.oid) AS total_bytes
               -- total minus heap minus TOAST leaves the indexes;
               -- without the TOAST term, TOAST would be subtracted twice in table_bytes
             , pg_total_relation_size(c.oid) - pg_relation_size(c.oid)
               - COALESCE(pg_total_relation_size(reltoastrelid), 0) AS index_bytes
             , pg_total_relation_size(reltoastrelid) AS toast_bytes
        FROM pg_class c
        LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE relkind = 'r'
    ) a
) a
ORDER BY total_bytes DESC;
The Docker image is 4.7 GB, so the approximate disk usage for this Greenplum container should be 4.7 + 0.5 = 5.2 GB. However, the container consumes 13 GB of disk space.
The disk usage is as below:
[gpadmin@mdw ~]$ df -h
Filesystem                           Size  Used Avail Use% Mounted on
overlay                               17G   13G  4.7G  73% /
tmpfs                                2.0G     0  2.0G   0% /dev
tmpfs                                2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/mapper/centos_greenplum01-root   17G   13G  4.7G  73% /etc/hosts
shm                                   64M     0   64M   0% /dev/shm
tmpfs                                2.0G     0  2.0G   0% /proc/acpi
tmpfs                                2.0G     0  2.0G   0% /proc/scsi
tmpfs                                2.0G     0  2.0G   0% /sys/firmware
The host machine and the Docker container both run CentOS.
As part of testing my application, I stop and start the Docker container multiple times throughout the day.
Debug steps to identify whether the root cause was Docker or Greenplum:
Log in to the Docker container:
cd /
du -schk *
Iteratively check the largest directories:
The cause of the issue is huge log files in /data/primary/gpseg1/pg_log.
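The drill-down described above could be scripted roughly as follows. This is only a sketch: the function name `largest_entries` is hypothetical, and it assumes GNU `du` (as shipped with CentOS). It lists the biggest entries directly under a directory, so it can be re-run on whichever child turns out to be largest.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Greenplum or Docker):
# print the largest entries directly under a directory, biggest first.
# Usage: largest_entries DIR [COUNT]
largest_entries() {
    local dir="${1:-/}" count="${2:-10}"
    # -x: stay on one filesystem (skips /proc, /sys, other mounts)
    # -a: include plain files as well as directories
    # -k: report sizes in KiB so that sort -n compares them correctly
    # --max-depth=1: only DIR itself and its direct children
    du -xak --max-depth=1 "$dir" 2>/dev/null \
        | sort -rn \
        | head -n "$count"
}
```

The first line of output is the total for the directory itself. A session might start with `largest_entries / 5` and then repeat the call on the largest child, for example `largest_entries /data/primary/gpseg1 5`, until the oversized directory is found.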
I removed all logs older than 2 days.
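A cleanup like this can be done with `find`. The sketch below is an assumption about how one might do it, not the exact command I ran: the function name `purge_old_logs` is hypothetical, the `*.csv` pattern assumes the segment logs are CSV files, and `-delete` requires GNU `find`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete log files older than N days from one directory.
# Usage: purge_old_logs LOG_DIR [DAYS]
purge_old_logs() {
    local log_dir="$1" days="${2:-2}"
    if [ ! -d "$log_dir" ]; then
        echo "not a directory: $log_dir" >&2
        return 1
    fi
    # -mmin +N matches files last modified more than N minutes ago.
    # Minutes are used because -mtime +2 truncates to whole days and
    # would only match files at least 3 days old.
    # '*.csv' is an assumed file name pattern; adjust it to the actual logs.
    find "$log_dir" -maxdepth 1 -type f -name '*.csv' \
        -mmin +"$((days * 1440))" -print -delete
}
```

For the directory above, the call would be `purge_old_logs /data/primary/gpseg1/pg_log 2`, repeated for each segment's `pg_log` directory. Dropping `-delete` first gives a dry run that only prints the files that would be removed. Because the filter is based on age, the log file currently being written is left alone.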