Delayed jobs service is stopping after less than 1 hour of its starting time with the following log:
I, [2018-02-26T06:00:26.580458 #11439] INFO -- : 2018-02-26T06:00:26+0400: [Worker(delayed_job host:myhost pid:11439)] Starting job worker
I, [2018-02-26T06:00:26.664929 #11439] INFO -- : 2018-02-26T06:00:26+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41019) RUNNING
I, [2018-02-26T06:00:27.342994 #11439] INFO -- : 2018-02-26T06:00:27+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41019) COMPLETED after 0.6779
I, [2018-02-26T06:00:27.346526 #11439] INFO -- : 2018-02-26T06:00:27+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41020) RUNNING
I, [2018-02-26T06:00:27.470858 #11439] INFO -- : 2018-02-26T06:00:27+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41020) COMPLETED after 0.1242
I, [2018-02-26T06:00:27.474937 #11439] INFO -- : 2018-02-26T06:00:27+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41024) RUNNING
I, [2018-02-26T06:00:27.603043 #11439] INFO -- : 2018-02-26T06:00:27+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41024) COMPLETED after 0.1280
I, [2018-02-26T06:00:27.606702 #11439] INFO -- : 2018-02-26T06:00:27+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41025) RUNNING
I, [2018-02-26T06:00:27.725715 #11439] INFO -- : 2018-02-26T06:00:27+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41025) COMPLETED after 0.1189
I, [2018-02-26T06:00:27.728021 #11439] INFO -- : 2018-02-26T06:00:27+0400: [Worker(delayed_job host:myhost pid:11439)] 4 jobs processed at 3.4871 j/s, 0 failed
I, [2018-02-26T06:14:48.287220 #11439] INFO -- : 2018-02-26T06:14:48+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41027) RUNNING
I, [2018-02-26T06:14:48.414079 #11439] INFO -- : 2018-02-26T06:14:48+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41027) COMPLETED after 0.1267
I, [2018-02-26T06:14:48.416335 #11439] INFO -- : 2018-02-26T06:14:48+0400: [Worker(delayed_job host:myhost pid:11439)] 1 jobs processed at 7.3771 j/s, 0 failed
I, [2018-02-26T06:16:33.492435 #11439] INFO -- : 2018-02-26T06:16:33+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41028) RUNNING
I, [2018-02-26T06:16:33.613684 #11439] INFO -- : 2018-02-26T06:16:33+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41028) COMPLETED after 0.1211
I, [2018-02-26T06:16:33.615953 #11439] INFO -- : 2018-02-26T06:16:33+0400: [Worker(delayed_job host:myhost pid:11439)] 1 jobs processed at 7.8121 j/s, 0 failed
I, [2018-02-26T06:22:33.853678 #11439] INFO -- : 2018-02-26T06:22:33+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41030) RUNNING
I, [2018-02-26T06:22:33.967338 #11439] INFO -- : 2018-02-26T06:22:33+0400: [Worker(delayed_job host:myhost pid:11439)] Job ActiveJob::QueueAdapters::DelayedJobAdapter::JobWrapper (id=41030) COMPLETED after 0.1136
I, [2018-02-26T06:22:33.970307 #11439] INFO -- : 2018-02-26T06:22:33+0400: [Worker(delayed_job host:myhost pid:11439)] 1 jobs processed at 8.2735 j/s, 0 failed
I, [2018-02-26T06:38:24.595215 #11439] INFO -- : 2018-02-26T06:38:24+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQconsumeInput() server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:38:24.593926', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:38:24.593351' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:38:24.593398') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
I, [2018-02-26T06:38:29.597026 #11439] INFO -- : 2018-02-26T06:38:29+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:38:29.596061', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:38:29.595477' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:38:29.595524') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
I, [2018-02-26T06:38:34.598775 #11439] INFO -- : 2018-02-26T06:38:34+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:38:34.597856', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:38:34.597278' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:38:34.597325') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
I, [2018-02-26T06:38:39.600772 #11439] INFO -- : 2018-02-26T06:38:39+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:38:39.599713', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:38:39.599063' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:38:39.599110') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
I, [2018-02-26T06:38:44.602546 #11439] INFO -- : 2018-02-26T06:38:44+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:38:44.601568', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:38:44.601024' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:38:44.601072') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
I, [2018-02-26T06:38:49.604286 #11439] INFO -- : 2018-02-26T06:38:49+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:38:49.603369', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:38:49.602808' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:38:49.602863') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
I, [2018-02-26T06:38:54.606189 #11439] INFO -- : 2018-02-26T06:38:54+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:38:54.605111', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:38:54.604563' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:38:54.604613') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
I, [2018-02-26T06:38:59.608610 #11439] INFO -- : 2018-02-26T06:38:59+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:38:59.607243', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:38:59.606483' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:38:59.606539') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
I, [2018-02-26T06:39:04.610465 #11439] INFO -- : 2018-02-26T06:39:04+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:39:04.609457', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:39:04.608876' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:39:04.608926') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
I, [2018-02-26T06:39:09.612201 #11439] INFO -- : 2018-02-26T06:39:09+0400: [Worker(delayed_job host:myhost pid:11439)] Error while reserving job: PG::ConnectionBad: PQsocket() can't get socket descriptor: UPDATE "delayed_jobs" SET locked_at = '2018-02-26 02:39:09.611263', locked_by = 'delayed_job host:myhost pid:11439' WHERE id IN (SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at <= '2018-02-26 02:39:09.610721' AND (locked_at IS NULL OR locked_at < '2018-02-25 22:39:09.610770') OR locked_by = 'delayed_job host:myhost pid:11439') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
database.yml
production:
adapter: postgresql
encoding: unicode
database: myapp
port: 5432
pool: 5
username: username
password: password
reconnect: true
Please can anyone just explain the reason of this error and how to avoid it:
Error while reserving job: PG::ConnectionBad: PQconsumeInput() server closed the connection unexpectedly
Update:
I believe this issue is not related to delayed jobs, since I'm getting same error while I'm doing some normal DB queries. So DB is restarting for some reason and this why delayed job service is stopping.
As per commented by @LaurenzAlbe , below are some issues found in /var/log/postgresql/postgresql-9.3-main.log
:
LOG: connection received: host=10.10.10.15 port=57322
LOG: replication connection authorized: user=MyDBUser
FATAL: must be superuser or replication role to start walsender
LOG: could not receive data from client: Connection reset by peer
LOG: disconnection: session time: 0:06:18.911 user=MyDBUser database=MyDB host=127.0.0.1 port=34040
./systemd: 36: kill: Operation not permitted
WARNING: skipping "delayed_jobs" --- only table or database owner can analyze it
After some research, I believe delayed jobs have a memory leak issue, which is solved by using config.cache_classes = true
because delayed jobs keeps reloading classes every now and then.
I had the same issue where delayed job process used more than 90%+ of memory and it crashed with the same error, even though I had cached classes, but I was running delayed jobs without RAILS_ENV
, which caused them to load in development and ignore the other environment settings.