ruby-on-rails-3mongoidmoped

intermittent "Moped::Protocol::Command failed with error nil" errors in Resque workers


I'm using Rails 3.2, Mongoid 3.1.7, Moped 1.5.3, and Resque 1.24.1

I'm repeatedly getting "Moped::Protocol::Command failed with error nil" [1] errors in my Resque workers.

This issue, and this one in the Moped repo make me think this is probably due to Resque forking, but they're both about older versions of Moped than I'm using, and I've been unable to find any specific information specific to configuring Resque to work nicely with Mongoid.

How do I resolve the errors in my Resque workers?

[1] Full error: Exception: Moped::Errors::OperationFailure Error: The operation: #<Moped::Protocol::Command @length=58 @request_id=3 @response_to=0 @op_code=2004 @flags=[] @full_collection_name="admin.$cmd" @skip=0 @limit=-1 @selector={:ismaster=>1} @fields=nil> failed with error nil /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:99:in `block in command' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:600:in `[]' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:600:in `block (3 levels) in flush' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:599:in `map' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:599:in `block (2 levels) in flush' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:150:in `ensure_connected' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:595:in `block in flush' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:610:in `logging' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:594:in `flush' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:583:in `process' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:92:in `command' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/node.rb:409:in `refresh' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/cluster.rb:171:in `block in refresh' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/cluster.rb:185:in `each' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/cluster.rb:185:in `refresh' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/cluster.rb:135:in `nodes' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/cluster.rb:206:in `with_primary' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/session/context.rb:108:in `with_node' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/cursor.rb:137:in `load_docs' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/cursor.rb:25:in `each' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/query.rb:76:in `each' /shared/bundle/ruby/2.2.0/gems/moped-1.5.3/lib/moped/query.rb:76:in `each' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/contextual/mongo.rb:423:in `entries' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/contextual/mongo.rb:423:in `check_existence' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/contextual/mongo.rb:146:in `exists?' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/contextual.rb:19:in `exists?' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/validations/uniqueness.rb:273:in `validate_root' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/validations/uniqueness.rb:52:in `block in validate_each' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/validations/queryable.rb:23:in `with_query' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/validations/uniqueness.rb:46:in `validate_each' /shared/bundle/ruby/2.2.0/gems/activemodel-3.2.22/lib/active_model/validator.rb:153:in `block in validate' /shared/bundle/ruby/2.2.0/gems/activemodel-3.2.22/lib/active_model/validator.rb:150:in `each' /shared/bundle/ruby/2.2.0/gems/activemodel-3.2.22/lib/active_model/validator.rb:150:in `validate' /shared/bundle/ruby/2.2.0/gems/activesupport-3.2.22/lib/active_support/callbacks.rb:310:in `_callback_before_265' /shared/bundle/ruby/2.2.0/gems/activesupport-3.2.22/lib/active_support/callbacks.rb:418:in `_run__123721054543109017__validate__3207805434451540367__callbacks' /shared/bundle/ruby/2.2.0/gems/activesupport-3.2.22/lib/active_support/callbacks.rb:405:in `__run_callback' /shared/bundle/ruby/2.2.0/gems/activesupport-3.2.22/lib/active_support/callbacks.rb:385:in `_run_validate_callbacks' /shared/bundle/ruby/2.2.0/gems/activesupport-3.2.22/lib/active_support/callbacks.rb:81:in `run_callbacks' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/callbacks.rb:130:in `run_callbacks' /shared/bundle/ruby/2.2.0/gems/activemodel-3.2.22/lib/active_model/validations.rb:228:in `run_validations!' /shared/bundle/ruby/2.2.0/gems/activemodel-3.2.22/lib/active_model/validations/callbacks.rb:53:in `block in run_validations!' /shared/bundle/ruby/2.2.0/gems/activesupport-3.2.22/lib/active_support/callbacks.rb:403:in `_run__123721054543109017__validation__3207805434451540367__callbacks' /shared/bundle/ruby/2.2.0/gems/activesupport-3.2.22/lib/active_support/callbacks.rb:405:in `__run_callback' /shared/bundle/ruby/2.2.0/gems/activesupport-3.2.22/lib/active_support/callbacks.rb:385:in `_run_validation_callbacks' /shared/bundle/ruby/2.2.0/gems/activesupport-3.2.22/lib/active_support/callbacks.rb:81:in `run_callbacks' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/callbacks.rb:130:in `run_callbacks' /shared/bundle/ruby/2.2.0/gems/activemodel-3.2.22/lib/active_model/validations/callbacks.rb:53:in `run_validations!' /shared/bundle/ruby/2.2.0/gems/activemodel-3.2.22/lib/active_model/validations.rb:195:in `valid?' /shared/bundle/ruby/2.2.0/gems/mongoid-3.1.7/lib/mongoid/validations.rb:84:in `valid?' /releases/20160315172839/app/performers/garmin_receive_push.rb:11:in `block in perform' /releases/20160315172839/app/performers/garmin_receive_push.rb:9:in `map' /releases/20160315172839/app/performers/garmin_receive_push.rb:9:in `perform'


Solution

  • This seems to have worked. Went from 30+ errors in my Resque workers yesterday to 0 so far in the past 18 hours.

    Added to my Resque config:

    Resque.before_fork do
        ::Mongoid.default_session.disconnect
    end
    

    I don't have a very coherent description of exactly what was going wrong here. Resque workers fork a child process, and then the child processes the job, so my current operating assumption is that this is similar to the problems described with Unicorn and Passenger forking in the linked issues.