Problem
The following query takes 42 seconds when most of the data in not cached:
EXPLAIN (ANALYZE, BUFFERS) select count(*) from packages where company_id = 178381;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=395914.63..395914.63 rows=1 width=8) (actual time=42411.940..42411.942 rows=1 loops=1)
Buffers: shared hit=21775 read=94888
I/O Timings: read=39723.315
-> Bitmap Heap Scan on packages (cost=1053.07..395761.41 rows=306442 width=0) (actual time=83.104..42336.765 rows=322432 loops=1)
Recheck Cond: (company_id = 178381)
Heap Blocks: exact=116385
Buffers: shared hit=21775 read=94888
I/O Timings: read=39723.315
-> Bitmap Index Scan on packages_company_id_index (cost=0.00..1037.75 rows=306442 width=0) (actual time=45.846..45.847 rows=325795 loops=1)
Index Cond: (company_id = 178381)
Buffers: shared hit=1 read=277
I/O Timings: read=7.090
Planning:
Buffers: shared hit=2
Planning Time: 0.237 ms
Execution Time: 42413.042 ms
Running the query again immediately after is naturally much faster:
distru_prod=> EXPLAIN (ANALYZE, BUFFERS) select count(*) from packages where company_id = 178381;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=395914.63..395914.63 rows=1 width=8) (actual time=416.943..416.957 rows=1 loops=1)
Buffers: shared hit=116589
-> Bitmap Heap Scan on packages (cost=1053.07..395761.41 rows=306442 width=0) (actual time=78.925..395.495 rows=322432 loops=1)
Recheck Cond: (company_id = 178381)
Heap Blocks: exact=116308
Buffers: shared hit=116589
-> Bitmap Index Scan on packages_company_id_index (cost=0.00..1037.75 rows=306442 width=0) (actual time=46.359..46.360 rows=325351 loops=1)
Index Cond: (company_id = 178381)
Buffers: shared hit=281
Planning:
Buffers: shared hit=448
Planning Time: 1.375 ms
Execution Time: 418.321 ms
More Info
Here is some basic info about this table:
select count(distinct company_id) from packages;
count
-------
691
select count(*) from packages;
count
----------
10764441
select count(*) from packages where company_id = 178381;
count
--------
322432
select pg_size_pretty(pg_total_relation_size('packages'));
pg_size_pretty
----------------
12 GB
select pg_size_pretty(pg_total_relation_size('packages_company_id_index'));
pg_size_pretty
----------------
79 MB
Index being used:
CREATE INDEX packages_company_id_index ON public.packages USING btree (company_id);
Even when using pg_hint_plan
to force an Index Only Scan on packages_company_id_index
, buffers
are still as high as when doing a Bitmap Heap Scan:
/*+ IndexOnlyScan(packages packages_company_id_index) */ EXPLAIN (ANALYZE, BUFFERS) select count(*) from packages where company_id = 178381;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=517401.31..517401.32 rows=1 width=8) (actual time=172.448..172.450 rows=1 loops=1)
Buffers: shared hit=116586
-> Index Only Scan using packages_company_id_index on packages (cost=0.09..517248.09 rows=306442 width=0) (actual time=0.034..150.510 rows=322432 loops=1)
Index Cond: (company_id = 178381)
Heap Fetches: 325351
Buffers: shared hit=116586
Planning:
Buffers: shared hit=2
Planning Time: 0.238 ms
Execution Time: 172.546 ms
Questions
buffers
as the original query even though the index is only 79MB in disk?packages
table at all?The Index-only scan reads so many buffers because it is not really an index-only scan. VACUUM
the table to refresh the visibility map, and you'll see the difference.
Probably because the visibility map is not updated and perhaps because you left random_page_cost
at a value that is too high.
According to the data you posted, the table should be at least 900 MB in size (116589 pages at 8 kB each).
Would you expect your disk to read more than 19MB/s?
Unanswerable, because it depends on an unknown "if".
See 1. and 2. above.