I have a multitenant application that was hitting connection limits and crashing Elasticsearch when using async reindexing via an `after_commit` hook. I've implemented bulk reindexing as suggested in the readme:
https://github.com/ankane/searchkick#queuing
I'm running a job every minute to keep things up to date. Elasticsearch is no longer crashing (most of the time), but I've hit the DB connection limit many times. This is the error in Sidekiq:
ActiveRecord::ConnectionTimeoutError: could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use
It's somewhat obvious why this is happening:
```ruby
class DocumentReindexQueueWorker
  include Sidekiq::Worker

  def perform
    Account.find_each do |account|
      Apartment::Tenant.switch(account.subdomain) do
        Searchkick::ProcessQueueJob.perform_later(class_name: "Document")
      end
    end
  end
end
```
This bombards the DB with connections because all tenants run their bulk reindex at the same moment. This is where I would normally introduce a staggered delay with `delay_for` and set `retry: false` so jobs don't pile up when they're just going to be queued again a minute later. Those options don't seem to exist for `Searchkick::ProcessQueueJob`, though.
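One way around this: `Searchkick::ProcessQueueJob` is an Active Job, so even though Sidekiq's `delay_for` isn't available on it, Active Job's `set(wait:)` is, and `retry: false` can go on the wrapper worker via `sidekiq_options`. Here's a sketch of the staggered version. The stub section at the top only exists so the snippet runs standalone (the tenant names are made up); in the real app those constants come from the sidekiq, apartment, and searchkick gems, and the `STAGGER` value is an assumption to tune.

```ruby
# ---- Stand-ins so this sketch runs on its own; in the real app these ----
# ---- come from the sidekiq, apartment, and searchkick gems.          ----
module Sidekiq
  module Worker
    def self.included(base)
      base.define_singleton_method(:sidekiq_options) { |_opts| }
    end
  end
end

module Apartment
  module Tenant
    def self.switch(_subdomain)
      yield
    end
  end
end

ENQUEUED = [] # records what the stub job receives, for inspection

module Searchkick
  class ProcessQueueJob
    def self.set(wait:)
      new(wait)
    end

    def initialize(wait)
      @wait = wait
    end

    def perform_later(**args)
      ENQUEUED << [@wait, args]
    end
  end
end

Account = Struct.new(:subdomain) do
  def self.find_each(&block)
    [new("acme"), new("globex"), new("initech")].each(&block)
  end
end

# ---- The actual sketch: stagger each tenant's bulk reindex so they ----
# ---- don't all open DB connections at the same moment.             ----
class DocumentReindexQueueWorker
  include Sidekiq::Worker
  # No retries: the cron job re-enqueues every minute anyway, so retries
  # would only pile up and add connection pressure.
  sidekiq_options retry: false

  STAGGER = 15 # seconds between tenants; tune to pool size and tenant count

  def perform
    Account.find_each.with_index do |account, i|
      Apartment::Tenant.switch(account.subdomain) do
        # set(wait:) is Active Job's delay, usable where delay_for isn't
        Searchkick::ProcessQueueJob
          .set(wait: i * STAGGER) # in Rails: (i * STAGGER).seconds
          .perform_later(class_name: "Document")
      end
    end
  end
end
```

With three tenants this enqueues the jobs at 0s, 15s, and 30s instead of all at once, so each tenant's reindex gets the pool to itself.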
I have a concurrency limit of 3 with `ActiveJob::TrafficControl` (per this). The database's `max_connections` is 100, but it's ActiveRecord that's reporting the connection limit. Could it be that I just need to increase `RAILS_MAX_THREADS`? It's currently set to 5. FYI, running `SELECT * FROM pg_stat_activity` shows 7 Puma connections and 5 Sidekiq connections.
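Some back-of-envelope accounting suggests the bottleneck is ActiveRecord's per-process pool, not Postgres. The numbers below come from the post except `sidekiq_concurrency`, which is an assumption (10 is Sidekiq's default in recent versions; check your `sidekiq.yml`):

```ruby
# Each process has its own ActiveRecord pool; Postgres only sees the sum.
pg_max_connections  = 100 # Postgres max_connections, from the post
puma_connections    = 7   # observed in pg_stat_activity
sidekiq_pool        = 5   # RAILS_MAX_THREADS, which also sizes the AR pool
sidekiq_concurrency = 10  # assumption: Sidekiq's default worker thread count

# Postgres-side pressure: nowhere near the server limit.
total_to_postgres = puma_connections + sidekiq_pool
puts total_to_postgres

# ActiveRecord-side pressure: more Sidekiq threads than pooled connections
# means threads queue for a checkout and raise ConnectionTimeoutError after
# the 5-second wait -- exactly the error above.
pool_starved = sidekiq_concurrency > sidekiq_pool
puts pool_starved
```

If that holds, raising the per-process pool (or lowering Sidekiq's concurrency) is the fix; `max_connections` has plenty of headroom.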
You can increase the connection pool size in your `config/database.yml` file:

```yaml
pool: 40
```
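Note that the pool is sized per process, so the usual pattern is to drive it from `RAILS_MAX_THREADS` and give each process at least as many connections as it has threads. A sketch (adapter and host details elided; the ERB default of 5 is Rails' convention, and you'd export a larger `RAILS_MAX_THREADS` for the Sidekiq process to match its concurrency):

```yaml
production:
  adapter: postgresql
  # Per-process pool: Puma needs >= RAILS_MAX_THREADS connections, Sidekiq
  # needs >= its concurrency. Keep (processes x pool) under max_connections.
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
```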