Tags: postgresql, explain

Misunderstanding about PostgreSQL temp files


PostgreSQL 13.7

I have the pg_stat_statements extension installed in the postgres database on all my PostgreSQL servers.

pmm-agent queries this extension's view every minute:

SELECT /* pmm-agent:pgstatstatements */
       pg_stat_statements.userid, pg_stat_statements.dbid, pg_stat_statements.queryid,
       pg_stat_statements.query, pg_stat_statements.calls, pg_stat_statements.total_exec_time,
       pg_stat_statements.rows, pg_stat_statements.shared_blks_hit, pg_stat_statements.shared_blks_read,
       pg_stat_statements.shared_blks_dirtied, pg_stat_statements.shared_blks_written,
       pg_stat_statements.local_blks_hit, pg_stat_statements.local_blks_read,
       pg_stat_statements.local_blks_dirtied, pg_stat_statements.local_blks_written,
       pg_stat_statements.temp_blks_read, pg_stat_statements.temp_blks_written,
       pg_stat_statements.blk_read_time, pg_stat_statements.blk_write_time
FROM pg_stat_statements
WHERE queryid IS NOT NULL
  AND query IS NOT NULL

On one of my servers this query causes temporary file usage.

EXPLAIN (ANALYZE, BUFFERS, SETTINGS) output:
"Function Scan on pg_stat_statements  (cost=0.00..10.00 rows=990 width=168) (actual time=1068.291..1207.841 rows=3401 loops=1)"
"  Filter: ((queryid IS NOT NULL) AND (query IS NOT NULL))"
"  Buffers: temp read=30382 written=30382"
"Settings: effective_cache_size = '24GB', effective_io_concurrency = '200', max_parallel_workers = '12', max_parallel_workers_per_gather = '6', random_page_cost = '1.1', temp_buffers = '16MB', work_mem = '40MB'"
"Planning:"
"  Buffers: shared hit=6"
"Planning Time: 0.153 ms"
"Execution Time: 1244.587 ms"

(Monitoring graph: temp files usage)

On the other servers the execution plan for this query is normal, for example:

"Function Scan on pg_stat_statements  (cost=0.00..10.00 rows=990 width=168) (actual time=0.939..1.064 rows=364 loops=1)"
"  Filter: ((queryid IS NOT NULL) AND (query IS NOT NULL))"
"Settings: cpu_index_tuple_cost = '0.0005', effective_cache_size = '16GB', effective_io_concurrency = '200', max_parallel_workers = '10', max_parallel_workers_per_gather = '6', parallel_tuple_cost = '0.05', random_page_cost = '1.1', temp_buffers = '16MB', work_mem = '153MB'"
"Planning:"
"  Buffers: shared hit=6"
"Planning Time: 0.140 ms"
"Execution Time: 2.224 ms"

There are no sorts or joins in the query, so why are temporary files used?

Increasing work_mem to 300MB did not solve the problem.


Solution

  • You can see from the execution plan that this view is backed by a function. PostgreSQL collects the function's result set in a data structure called a "tuple store". As long as the tuple store fits in work_mem, it is kept in RAM; if it exceeds that size, it is spooled to a temporary file.

    So either there are more data in the first case (pg_stat_statements.max has a higher value, or the statement texts are longer), or work_mem is smaller in the first case (the plans show 40MB versus 153MB).
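
    This spill behaviour can be reproduced with any set-returning function, since a Function Scan materializes its rows in a tuple store before the filter is applied. A minimal sketch, assuming a PostgreSQL 13+ session where generate_series stands in for pg_stat_statements:

    ```sql
    -- Force a tiny work_mem so the tuple store spills to disk.
    SET work_mem = '64kB';

    -- The Function Scan materializes all rows in a tuple store first;
    -- once that exceeds work_mem, the plan reports temp buffers.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM generate_series(1, 100000) AS g(i);

    -- To compare how much data the tuple store must hold on the
    -- affected server versus a healthy one:
    SELECT count(*)                                   AS statements,
           pg_size_pretty(sum(length(query))::bigint) AS total_query_text
    FROM pg_stat_statements;
    SHOW pg_stat_statements.max;
    ```

    With the reduced work_mem, the EXPLAIN output should contain a "Buffers: temp read=... written=..." line just like the problematic plan above, and raising work_mem makes it disappear. The comparison query shows whether the affected server simply tracks more, or longer, statements (3401 rows versus 364 in the plans above).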