So I have quite a few workers that execute frequently, on schedules ranging from daily to hourly. There have been incidents where a few of them just did not execute, without any sign of failure. I need to come up with a solution to track these. I thought about having a listener that logs every time a worker starts, but there are just too many workers to keep track of. A better approach would be for me to know when a worker did not run. That is more important.
I've thought about creating a table where I log each time a worker starts executing. If the last log entry for a worker is too old (older than the interval at which it is supposed to run), it notifies me.
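To make the idea concrete, here is a minimal sketch of that staleness check in plain Ruby. The class name `WorkerHeartbeats`, the worker names, and the in-memory hash standing in for the table are all assumptions made for illustration; in a real app the timestamps would live in a database table.

```ruby
# Hypothetical sketch: the class name, worker names, and in-memory store
# are assumptions; a real version would read and write a database table.
class WorkerHeartbeats
  # expected: { "WorkerName" => maximum allowed seconds between runs }
  def initialize(expected)
    @expected = expected
    @last_run = {}
  end

  # Call this whenever a worker starts executing.
  def record(worker_name, at = Time.now)
    @last_run[worker_name] = at
  end

  # Returns the names of workers whose last recorded start is older than
  # their allowed gap, or that have never been recorded at all.
  def overdue(now = Time.now)
    @expected.select do |name, max_gap|
      last = @last_run[name]
      last.nil? || (now - last) > max_gap
    end.keys
  end
end

# Allow each worker its normal interval plus a five-minute grace period.
heartbeats = WorkerHeartbeats.new(
  "HourlySyncWorker"  => 3_600 + 300,
  "DailyReportWorker" => 86_400 + 300
)
heartbeats.record("HourlySyncWorker")
heartbeats.overdue # names of workers that should trigger a notification
```

A worker that has never logged a start is treated as overdue here, which seems like the safer default when the goal is to catch runs that silently never happened.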
This approach should give you some ideas for how you might use the Sidekiq API to send notifications, perhaps through a Slack notifier class. You might put the check in a worker of its own and run it on some other schedule. Of course, if that worker were to fail because of resources, well, that's a compounding problem. But hopefully you have some priorities in your queues.
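require 'net/http'
require 'uri'
require 'openssl'
require 'json'

# Posts a message to the Slack incoming webhook configured in ENV['SLACK_WEBHOOK'].
class SlackNotifier
  attr_reader :params

  def initialize(params)
    @params = params
  end

  def notify
    # Do nothing when no webhook is configured.
    return if ENV['SLACK_WEBHOOK'].nil?

    channel = "dev"
    uri = URI.parse(ENV['SLACK_WEBHOOK'])
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = true
    # Certificate verification is skipped outside production; convenient locally, but insecure.
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE unless defined?(Rails) && Rails.env.production?

    request = Net::HTTP::Post.new(uri.request_uri)
    # Build the payload with to_json so quotes in the text cannot break it.
    request.set_form_data(payload: { channel: channel, username: 'webhookbot', text: params[:text] }.to_json)
    http.request(request)
  end
end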
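# Find jobs that have been waiting in the 'long_running' queue for more than 8 hours.
# (8.hours.ago comes from ActiveSupport, so this assumes a Rails environment.)
long_running = Sidekiq::Queue.new('long_running')
stale_jobs = long_running.select { |job| job.enqueued_at < 8.hours.ago }
stale_jobs.each do |job|
  SlackNotifier.new(text: job.item.to_s).notify
end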
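If the worry is that there are too many workers to instrument one by one, a Sidekiq server middleware might record a start time for every job in a single place. The following is only a sketch: `HeartbeatMiddleware` and the hash-like `store` are hypothetical names, and a real version would write to a table or Redis rather than a plain Hash.

```ruby
# Hypothetical sketch: HeartbeatMiddleware and the store are assumptions.
# Sidekiq server middleware is any object that responds to
# call(job_instance, job_payload, queue) and yields to run the job.
class HeartbeatMiddleware
  def initialize(store)
    @store = store
  end

  def call(_job_instance, job_payload, _queue)
    # job_payload["class"] holds the worker class name as a string.
    @store[job_payload["class"]] = Time.now
    yield
  end
end

# Registration would go in a Sidekiq initializer, roughly like this
# (HEARTBEAT_STORE is a placeholder for whatever store you choose):
#
#   Sidekiq.configure_server do |config|
#     config.server_middleware do |chain|
#       chain.add HeartbeatMiddleware, HEARTBEAT_STORE
#     end
#   end
```

A separate scheduled check can then compare each stored timestamp against the interval that worker is supposed to run on and notify when one is overdue.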