I'm currently using god to start 6 Resque worker processes. Resque shows that they have started, and jobs are being processed. Occasionally, though, a worker process drops out of recognition and is no longer listed as a known Resque worker. I'm looking for a way to either restart that process or get resque-web to recognize it again. What's weird is that the process is still running in the background and forking to work on jobs: I can see the pending count decrease in resque-web, yet it reports that no workers are running. I've looked into Resque's stale.god script, but it doesn't help, because the process keeps retrieving jobs after resque-web stops recognizing it. Here is my setup:
#resque-production.god
6.times do |num|
  God.watch do |w|
    w.name = "resque-#{num}"
    w.group = "resque"
    w.interval = 30.seconds
    w.env = { 'RAILS_ENV' => 'production' }
    w.dir = File.expand_path(File.join(File.dirname(__FILE__)))
    w.start = "bundle exec rake environment RAILS_ENV=production resque:workers:start"
    w.start_grace = 10.seconds
    w.log = "/var/www/loadmax/shared/log/resque-worker.log"

    # restart if memory gets too high
    w.transition(:up, :restart) do |on|
      on.condition(:memory_usage) do |c|
        c.above = 200.megabytes
        c.times = 2
      end
    end

    # determine the state on startup
    w.transition(:init, { true => :up, false => :start }) do |on|
      on.condition(:process_running) do |c|
        c.running = true
      end
    end

    # determine when process has finished starting
    w.transition([:start, :restart], :up) do |on|
      on.condition(:process_running) do |c|
        c.running = true
        c.interval = 5.seconds
      end

      # failsafe
      on.condition(:tries) do |c|
        c.times = 5
        c.transition = :start
        c.interval = 5.seconds
      end
    end

    # start if process is not running
    w.transition(:up, :start) do |on|
      on.condition(:process_running) do |c|
        c.running = false
      end
    end
  end
end
The next file is used to connect to a single Redis server and to set the queue priorities.
#resque.rake
require 'resque/tasks'

Dir.glob("#{Rails.root}/app/workers/*.rb") do |rb|
  require rb
end

task "resque:setup" => :environment do
  resque_config = YAML.load_file(Rails.root.join("config","resque.yml"))
  ENV['QUEUE'] = resque_config["priority"].map{ |x| "#{x}" }.join(",") if ENV['QUEUE'].nil?
end

task "resque:workers:start" => :environment do
  threads = []
  q = [1,2]
  resque_config = YAML.load_file(Rails.root.join("config","resque.yml"))
  threads << Thread.new(q){ |qs|
    %x[bundle exec rake environment RAILS_ENV=#{Rails.env} resque:work QUEUE=#{resque_config["priority"].map{ |x| "#{x}" }.join(",")} ]
  }
  threads.each {|aThread| aThread.join }
end
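To make it easier to see what the resque:workers:start task actually runs, here is a minimal sketch of how the QUEUE string and the worker command are assembled from resque.yml. The helper names queue_list and worker_command are hypothetical, and the sample YAML is only an assumed shape for the file (a "priority" key holding a list of queue names), not the real config.

```ruby
require 'yaml'

# Hypothetical helper: turn the "priority" list from resque.yml into the
# comma-separated string that is passed to the worker as QUEUE.
def queue_list(config)
  Array(config["priority"]).map(&:to_s).join(",")
end

# Hypothetical helper: build the worker command as a string without running it,
# mirroring the command used inside the resque:workers:start task.
def worker_command(config, rails_env)
  "bundle exec rake environment RAILS_ENV=#{rails_env} resque:work QUEUE=#{queue_list(config)}"
end

# Assumed example content for resque.yml (not the real file).
sample = YAML.load("priority:\n  - critical\n  - high\n  - low\n")
puts worker_command(sample, "production")
```

Since %x runs that command in a subprocess, the process that god starts and watches is the wrapper rake task, and the Resque worker itself runs as its child.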
I've been looking all over for a solution, but the usual advice about zombie processes, stale processes, and exiting processes doesn't seem to apply here. I'm using god -c /path/to/god to start everything.
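A rough way to confirm the symptom (a process that is alive but missing from Resque's registry) would be to compare the registered worker ids against the pids that are actually running. This is only a sketch: it relies on Resque worker ids having the form hostname:pid:queues, and the helper names parse_worker_id, pid_alive? and unregistered_pids are hypothetical. The pid check only works for processes on the local machine.

```ruby
# Resque worker ids look like "hostname:pid:queue1,queue2".
def parse_worker_id(id)
  host, pid, queues = id.split(":", 3)
  { host: host, pid: pid.to_i, queues: queues.to_s.split(",") }
end

# Returns true if a process with this pid exists on the local machine.
# Signal 0 performs the existence check without delivering a signal.
def pid_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true # the process exists but belongs to another user
end

# Hypothetical helper: from a list of local pids, return those that are
# alive but not registered with Resque under the given hostname.
def unregistered_pids(registered_ids, local_pids, host)
  registered = registered_ids.map { |id| parse_worker_id(id) }
                             .select { |w| w[:host] == host }
                             .map { |w| w[:pid] }
  local_pids.select { |pid| pid_alive?(pid) && !registered.include?(pid) }
end

# In a Rails console the registered ids could be collected with:
#   registered_ids = Resque.workers.map(&:to_s)
```

The list of local pids would have to come from the worker processes themselves (for example from ps), because the pid god tracks here belongs to the wrapper rake task rather than to the worker it spawns.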
Let me know if I need to provide anything else or be more clear. Thanks for all the help!
I ended up putting Redis on the same box as the workers, and they have been functioning properly since.