ruby-on-rails, redis, sidekiq, bluepill, cloud66

Sidekiq keeps rebooting Cloud66


I have been struggling with this issue for a while and just can't figure it out. I am trying to get Redis and Sidekiq to handle background jobs for my Rails project hosted on Cloud66 with DigitalOcean. All the needed gems seem to be present, and the setup works perfectly locally.

My first attempt used the following settings.

This is my config/sidekiq.yml file:

---
:concurrency: 25
:pidfile: ./tmp/pids/sidekiq.pid
:logfile: ./log/sidekiq.log
:queues:
  - default
  - [high_priority, 2]
:daemon: true

As per this tutorial https://mikecoutermarsh.com/setting-up-redis-on-cloud66-for-sidekiq/, this is the content of my Procfile:

worker: env RAILS_ENV=$RAILS_ENV REDIS_URL=$REDIS_URL_INT bundle exec sidekiq -C config/sidekiq.yml

$REDIS_URL_INT is an ENV variable set to redis://104.236.131.187:6379. This ENV variable deviates from the one in the tutorial (it includes the port), as per a suggestion in the comments of the blog post.
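
For reference, here is a minimal sketch of how the Redis connection could also be pinned down explicitly in an initializer, assuming REDIS_URL is exported to both the web and worker processes; the file name and the fallback URL are illustrative only, and Sidekiq 3.x will also pick up REDIS_URL on its own:

# config/initializers/sidekiq.rb  (illustrative)
redis_config = { url: ENV.fetch('REDIS_URL', 'redis://localhost:6379') }

Sidekiq.configure_server do |config|
  config.redis = redis_config   # connection used by the Sidekiq worker process
end

Sidekiq.configure_client do |config|
  config.redis = redis_config   # connection used by the Rails app when enqueuing jobs
end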

After deploying with these settings my Sidekiq log gives me the following:

2015-05-16T16:19:44.732Z 14636 TID-1g96vc INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}
2015-05-16T16:20:13.801Z 14701 TID-3trg0 INFO: Running in ruby 2.1.5p273 (2014-11-13 revision 48405) [x86_64-linux]
2015-05-16T16:20:13.823Z 14701 TID-3trg0 INFO: See LICENSE and the LGPL-3.0 for licensing details.
2015-05-16T16:20:13.823Z 14701 TID-3trg0 INFO: Upgrade to Sidekiq Pro for more features and support: http://sidekiq.org/pro
2015-05-16T16:20:15.167Z 14701 TID-18nsv4 INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}
2015-05-16T16:20:15.180Z 14701 TID-7791g INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}
2015-05-16T16:20:32.065Z 14753 TID-6uz3g INFO: Running in ruby 2.1.5p273 (2014-11-13 revision 48405) [x86_64-linux]
2015-05-16T16:20:32.066Z 14753 TID-6uz3g INFO: See LICENSE and the LGPL-3.0 for licensing details.
2015-05-16T16:20:32.066Z 14753 TID-6uz3g INFO: Upgrade to Sidekiq Pro for more features and support: http://sidekiq.org/pro
2015-05-16T16:20:32.129Z 14753 TID-1bl0r0 INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}
2015-05-16T16:20:54.584Z 14852 TID-5t1rs INFO: Running in ruby 2.1.5p273 (2014-11-13 revision 48405) [x86_64-linux]
2015-05-16T16:20:54.585Z 14852 TID-5t1rs INFO: See LICENSE and the LGPL-3.0 for licensing details.
2015-05-16T16:20:54.585Z 14852 TID-5t1rs INFO: Upgrade to Sidekiq Pro for more features and support: http://sidekiq.org/pro
2015-05-16T16:20:54.665Z 14852 TID-1aj3m0 INFO: Booting Sidekiq 3.3.2 with redis options {:url=>"redis://104.236.131.187:6379"}

This gives me the impression that Sidekiq is rebooting all the time, so I checked the Sidekiq processes:

12747 ?        Sl     0:10 sidekiq 3.3.2 web_head [0 of 25 busy]
13540 ?        Sl     0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13596 ?        Sl     0:08 sidekiq 3.3.2 web_head [0 of 25 busy]
13650 ?        Sl     0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
13702 ?        Sl     0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
13758 ?        Sl     0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13818 ?        Sl     0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13869 ?        Sl     0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13934 ?        Sl     0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
13986 ?        Sl     0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
14089 ?        Sl     0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14144 ?        Sl     0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14196 ?        Sl     0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14259 ?        Sl     0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14311 ?        Sl     0:06 sidekiq 3.3.2 web_head [0 of 25 busy]
14363 ?        Sl     0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14421 ?        Sl     0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14474 ?        Sl     0:07 sidekiq 3.3.2 web_head [0 of 25 busy]
14530 ?        Sl     0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14585 ?        Sl     0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14636 ?        Sl     0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14701 ?        Sl     0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14753 ?        Sl     0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14852 ?        Sl     0:05 sidekiq 3.3.2 web_head [0 of 25 busy]
14913 ?        Sl     0:04 sidekiq 3.3.2 web_head [0 of 25 busy]
14966 ?        Sl     0:04 sidekiq 3.3.2 web_head [0 of 25 busy]
15023 ?        Sl     0:04 sidekiq 3.3.2 web_head [0 of 25 busy]

That's a lot of Sidekiq action! I didn't ask for that; just one process is all I need.
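
As a stop-gap while investigating, the stray workers can be cleared by hand with standard tools (just an illustration; Bluepill will start a fresh worker again on its own):

# list the duplicate workers, then ask them to shut down gracefully (TERM lets Sidekiq finish in-flight jobs)
ps aux | grep '[s]idekiq 3.3.2'
pkill -TERM -f 'sidekiq 3.3.2'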

My current theory is that I am missing a link in the Rails / Sidekiq / Redis setup, so I added a Redis config file at config/redis/production.conf:

daemonize yes
port 6379
logfile ./log/redis_production.log
dbfilename ./db/production.rdb

This made no difference. Also, no redis_production.log or production.rdb is being created, so I guess Cloud66 is taking care of the Redis part. If I check the web console, the Redis server is running on the correct port.
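
A quick way to sanity-check the Rails-to-Redis link from the server is to ping Redis from a Rails console; this is only a sketch, assuming the redis gem is available and REDIS_URL points at the Cloud66 Redis instance:

# run inside `bundle exec rails console` on the server (illustrative)
require 'redis'

redis = Redis.new(url: ENV['REDIS_URL'])
puts redis.ping   # prints "PONG" if the connection works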

I believe Cloud66 uses Bluepill to manage its processes. Its log file, user_worker_pill.log, contains the following:

I, [2015-05-16T16:28:27.157623 #11066]  INFO -- : [user_worker:worker:user_worker_1] Going from down => starting
E, [2015-05-16T16:28:47.183939 #11066] ERROR -- : [user_worker:worker:user_worker_1] Failed to signal process 16244 with code 0: No such process
E, [2015-05-16T16:28:47.185674 #11066] ERROR -- : [user_worker:worker:user_worker_1] Failed to signal process 16244 with code 0: No such process
I, [2015-05-16T16:28:47.618515 #11066]  INFO -- : [user_worker:worker:user_worker_1] Going from starting => down
E, [2015-05-16T16:28:48.627548 #11066] ERROR -- : [user_worker:worker:user_worker_1] Failed to signal process 16244 with code 0: No such process
E, [2015-05-16T16:28:48.629944 #11066] ERROR -- : [user_worker:worker:user_worker_1] Failed to signal process 16244 with code 0: No such process
D, [2015-05-16T16:28:48.991312 #11066] DEBUG -- : [user_worker] pid journal file: /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1
D, [2015-05-16T16:28:48.993154 #11066] DEBUG -- : [user_worker] pid journal = 16244
D, [2015-05-16T16:28:48.993257 #11066] DEBUG -- : [user_worker] Acquired lock /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1.lock
D, [2015-05-16T16:28:48.993396 #11066] DEBUG -- : [user_worker] Unable to term missing process 16244
D, [2015-05-16T16:28:48.993535 #11066] DEBUG -- : [user_worker] Journal cleanup completed
D, [2015-05-16T16:28:48.993595 #11066] DEBUG -- : [user_worker] Cleared lock /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1.lock
D, [2015-05-16T16:28:48.993654 #11066] DEBUG -- : [user_worker] pgid journal file: /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1
D, [2015-05-16T16:28:48.993829 #11066] DEBUG -- : [user_worker] pgid journal = 16241
D, [2015-05-16T16:28:48.993901 #11066] DEBUG -- : [user_worker] Acquired lock /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1.lock
D, [2015-05-16T16:28:48.993994 #11066] DEBUG -- : [user_worker] Unable to term missing process group 16241
D, [2015-05-16T16:28:48.995031 #11066] DEBUG -- : [user_worker] Journal cleanup completed
D, [2015-05-16T16:28:48.995180 #11066] DEBUG -- : [user_worker] Cleared lock /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1.lock
W, [2015-05-16T16:28:48.995344 #11066]  WARN -- : [user_worker:worker:user_worker_1] Executing start command: env RAILS_ENV=production REDIS_URL=redis://104.236.131.187:6379 bundle exec sidekiq -C config/sidekiq.yml
D, [2015-05-16T16:28:49.457935 #11066] DEBUG -- : [user_worker] Acquired lock /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1.lock
D, [2015-05-16T16:28:49.458693 #11066] DEBUG -- : [user_worker] pgid journal file: /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1
D, [2015-05-16T16:28:49.459430 #11066] DEBUG -- : [user_worker] Saving pgid 16296 to process journal user_worker_1
I, [2015-05-16T16:28:49.459854 #11066]  INFO -- : [user_worker] Saved pgid 16296 to journal user_worker_1
D, [2015-05-16T16:28:49.460220 #11066] DEBUG -- : [user_worker] Journal now = 16296

D, [2015-05-16T16:28:49.460454 #11066] DEBUG -- : [user_worker] Cleared lock /var/run/bluepill/journals/.bluepill_pgids_journal.user_worker_1.lock
D, [2015-05-16T16:28:49.460656 #11066] DEBUG -- : [user_worker] Acquired lock /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1.lock
D, [2015-05-16T16:28:49.460901 #11066] DEBUG -- : [user_worker] pid journal file: /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1
D, [2015-05-16T16:28:49.461174 #11066] DEBUG -- : [user_worker] Saving pid 16299 to process journal user_worker_1
I, [2015-05-16T16:28:49.462289 #11066]  INFO -- : [user_worker] Saved pid 16299 to journal user_worker_1
D, [2015-05-16T16:28:49.462563 #11066] DEBUG -- : [user_worker] Journal now = 16299

D, [2015-05-16T16:28:49.462916 #11066] DEBUG -- : [user_worker] Cleared lock /var/run/bluepill/journals/.bluepill_pids_journal.user_worker_1.lock

This is outside my limited expertise in the matter, but it seems to me that Bluepill is repeatedly trying to bring the crashed process back up using the command from the Procfile.

This is all the information I have been able to gather, and I have no idea how to proceed. I would really appreciate any insight, comments, or suggestions.

THANKS!

EDIT:

After the comment from Phillip, I changed $REDIS_URL_INT to $REDIS_ADDRESS (the IP without the port), and this is the sidekiq.log:

2015-05-18T14:00:05.683Z 15878 TID-1dm310 ERROR: heartbeat: Waited 1 sec
2015-05-18T14:00:07.769Z 15878 TID-boxzc ERROR: Waited 1 sec
2015-05-18T14:00:07.769Z 15878 TID-boxzc ERROR: /var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:85:in `block (2 levels) in pop'
2015-05-18T14:00:08.770Z 15878 TID-boxzc WARN: {:context=>"scheduling poller thread died!"}
2015-05-18T14:00:08.771Z 15878 TID-boxzc WARN: Waited 1 sec
2015-05-18T14:00:08.771Z 15878 TID-boxzc WARN: /var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:85:in `block (2 levels) in pop'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:77:in `loop'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:77:in `block in pop'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:76:in `synchronize'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:76:in `pop'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool.rb:78:in `checkout'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool.rb:60:in `with'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq.rb:74:in `redis'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/api.rb:634:in `cleanup'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/api.rb:627:in `initialize'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/scheduled.rb:87:in `new'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/scheduled.rb:87:in `poll_interval'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/scheduled.rb:66:in `block in poll'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/util.rb:16:in `watchdog'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/sidekiq-3.3.2/lib/sidekiq/scheduled.rb:51:in `poll'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `public_send'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `dispatch'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/calls.rb:122:in `dispatch'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60:in `block in invoke'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71:in `block in task'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357:in `block in task'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57:in `block in initialize'
/var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15:in `block in create'
2015-05-18T14:00:08.774Z 15878 TID-1dm5j0 WARN: Sidekiq died due to the following error, cannot recover, process exiting
2015-05-18T14:00:08.775Z 15878 TID-1dm5j0 WARN: Waited 1 sec
2015-05-18T14:00:08.776Z 15878 TID-1dm5j0 WARN: /var/deploy/gemconn/web_head/shared/bundle/ruby/2.1.0/gems/connection_pool-2.1.1/lib/connection_pool/timed_stack.rb:85:in `block (2 levels) in pop'

Solution

  • I'm adding another answer to make this solution clearer. I had a closer look, and your Sidekiq configuration is actually daemonizing the process, while the processes should run in the foreground for us to control them. This is why you were seeing so many Sidekiq processes running: our Bluepill would start one, think it hadn't come up, and so start another.

    If you remove :daemon: true from your sidekiq.yml and redeploy, this should solve the problem.
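
    For illustration, the trimmed sidekiq.yml would then look like the sketch below. (Dropping :pidfile: and :logfile: as well, so that output goes to stdout under the supervisor, is optional and only a suggestion.)

    ---
    :concurrency: 25
    :pidfile: ./tmp/pids/sidekiq.pid
    :logfile: ./log/sidekiq.log
    :queues:
      - default
      - [high_priority, 2]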