postgresqlvacuum

Is Aggressive Auto vacuum impacting Query performance?


On postgres 14, we observed a 4x spike in the frequency of auto vacuum with correlation to our usual morning increase in production traffic. Database logs showed "Aggressive autovaccum to prevent wraparound".

Number of locks held and queries in pending state spiked, eventually leading to the database being un-responsive to queries. Don't know what caused the lock conflict.

Our auto vacuum settings from pg.conf are below:

autovacuum |on
autovacuum_analyze_scale_factor |0.05
autovacuum_analyze_threshold |50
autovacuum_freeze_max_age |200000000
autovacuum_max_workers |12
autovacuum_multixact_freeze_max_age |400000000
autovacuum_naptime |5s
autovacuum_vacuum_cost_delay |10ms
autovacuum_vacuum_cost_limit |-1
autovacuum_vacuum_insert_scale_factor |0.2
autovacuum_vacuum_insert_threshold |1000
autovacuum_vacuum_scale_factor |0.02
autovacuum_vacuum_threshold |50
autovacuum_work_mem |-1

The queries are mostly index based CRUD operations (SELECT, UPDATE)roughly around 500 TPS on 100GB-300GB sized tables.

We are methodically trying to find out the root-cause of the slowdown. Currently the spike in auto-vacuum is our leading hypothesis. Will appreciate feedback or any changes to the pg.conf settings.


Solution

  • Yes, crazy autovacuum parameter values like that can bring down your database. Together with the increased duration and resource utilization that an anti-wraparound autovacuum run brings, we have a good explanation for what you observed.

    First, you allow for 12 autovacuum workers. Obviously too many for your hardware. On the other hand, autovacuum_vacuum_cost_delay is set to the very high value 10ms, which makes autovacuum slow. I won't even start to discuss the other parameters. At any rate, you would have done better leaving all those parameters at their default values.