I tried to vacuum a table which is 790 GB in size:
wento=# VACUUM (VERBOSE, ANALYZE) public.wento;
INFO: vacuuming "wento.public.wento"
INFO: finished vacuuming "wento.public.wento": index scans: 0
pages: 0 removed, 19249 remain, 727 scanned (3.78% of total)
tuples: 0 removed, 2543622 remain, 5034 are dead but not yet removable
removable cutoff: 2797274, which was 18769 XIDs old when operation ended
index scan bypassed: 160 pages from table (0.83% of total) have 638 dead item identifiers
avg read rate: 98.195 MB/s, avg write rate: 23.735 MB/s
buffer usage: 622 hits, 724 misses, 175 dirtied
WAL usage: 175 records, 175 full page images, 1372087 bytes
system usage: CPU: user: 0.00 s, system: 0.01 s, elapsed: 0.05 s
INFO: vacuuming "wento.pg_toast.pg_toast_19276"
^C
Cancel request sent
ERROR: canceling statement due to user request
CONTEXT: while scanning block 102413194 of relation "pg_toast.pg_toast_19276"
I understand that the dead tuples weren't removed because there is a long-running transaction.
But I want to understand why only 727 pages (3.78% of total) were scanned.
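For context, this is roughly the query I use to look for the session that holds back cleanup; it is only a sketch that checks pg_stat_activity, although prepared transactions and replication slots can also hold back the removable cutoff:

-- sessions whose open transaction (backend_xid) or snapshot (backend_xmin)
-- can keep dead tuples from becoming removable
SELECT pid, state, xact_start, backend_xid, backend_xmin, query
FROM pg_stat_activity
WHERE backend_xid IS NOT NULL OR backend_xmin IS NOT NULL
ORDER BY xact_start;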
VACUUM normally only scans pages that have received writes since the previous VACUUM. The manual:

    VACUUM uses the visibility map to determine which pages of a table must be scanned. Normally, it will skip pages that don't have any dead row versions even if those pages might still have row versions with old XID values.
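You can check how much of the table the visibility map already covers. A minimal sketch, assuming the pg_visibility extension can be installed; relallvisible in pg_class is only the estimate maintained by VACUUM and ANALYZE:

CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- exact counts of all-visible and all-frozen pages from the visibility map
SELECT * FROM pg_visibility_map_summary('public.wento');

-- rough estimate kept in the catalog
SELECT relpages, relallvisible
FROM pg_class
WHERE oid = 'public.wento'::regclass;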
There is more fine print in the manual.
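If you need VACUUM to read every page regardless of the visibility map, the manual documents the DISABLE_PAGE_SKIPPING option, intended mainly for cases where the contents of the visibility map are suspect:

wento=# VACUUM (VERBOSE, DISABLE_PAGE_SKIPPING) public.wento;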