rubynomethoderror

What could be causing ruby NoMethodError backtrace to be so slow?


I have a pretty large ruby (non-rails) application that I'm developing. It's reasonably fast considering how large and complex it is (go ruby!), but sometimes I fat finger a method name and get the NoMethodError.

And usually when this happens, the application hangs for like 20 to 30 seconds to just print out the backtrace.

Specifically, if I do something like this:

puts "about to crash!"
Array.new().inspekt    # NoMethodError here

I see the "about to crash!" right away, and then 20s or so nothing seems to happen before I finally get the NoMethodError and backtrace.

At first I thought it might be the "did you mean" gem, so I turned that off with --disable-did_you_mean on the command line, and that turned off the "did you mean" suggestions, but nothing sped up the backtrace.

What's interesting is that this is only for NoMethodError.

If I cause some other exception, such as:

puts "about to crash!"
a = 3/0

Then I see the backtrace immediately.

And to make things even weirder, if I interrupt the process right after the "about to crash!" (such as with a ctrl-c on unix) then I immediately get the NoMethodError and it's backtrace. So it has the information - but ruby is stuck on trying to clean something up perhaps, something that only gets cleaned up on NoMethodError?

Info: ruby 2.7.0

OS: CentOS Linux release 7.5.1804

UPDATE - to responses so far: Everyone seems to be concerned about the backtrace and profiling the ruby code.

Except the slowdown is NOT happening there. There are NO LINES OF RUBY CODE that are executed during the slowdown. All of the lines prior to this, "in the backtrace" are already executed and in a matter of a second or so. Then the system hangs, between the puts and the NoMethodError. There is no ruby code in between to profile, so any profiler that is looking at code written in my ruby script isn't going to help. The slowdown is something internal to ruby and is not in my code, unless I'm terribly confused about what's happening.

To be very clear:

Line 10042:    puts "HERE"                  # Happens at ~1s
Line 10043:    Array.new().inspekt          # Happens at ~20-30s

There is no code between those lines to profile. The 20-30s is not happening in any code before line 10042 executes, so profiling that will not help.

I do have other Fibers that are paused. But there is no code here that yields to them. Is it possible that there's some strange built-in yield code that attempts to run other (paused) fibers when an exception is hit? I can't think of a reason you'd ever want this behavior, and many reasons why it would be catastrophic, but I can't think of anything else that would cause this problem (that is also killable with a ctrl-c!)


Solution

  • I finally figured out the issue.

    The problem was that NoMethodError will use inspect, and this inspects out all the instance variables of the receiver to the NoMethodError.

    Since I'm writing a fairly large program with massive objects containing lots of data, many of my top level objects take huge amounts of time to calculate the 'inspect' string.

    Interestingly, if you 'ctrl-c' the NoMethodError, it stops trying to build the inspect string and just uses the class name. I have a feeling the other instances were running out of memory trying to build the inspect string and would end up in the same place, printing a simple NoMethodError, with no clue as to what was happening (and failing) behind the scenes.

    The solution is pretty simple, I could just define 'inspect' to be more reasonable on all of my objects. Another option is to redefine the 'detailed_message' in NoMethodError so that it temporarily redefines the 'inspect' method on the receiver of the error to be 'to_s' before calculating the message. This could probably be generalized for other exceptions with the same types of issues, but I'm currently only seeing it on NoMethodError.

    Code is below - I'm probably doing something not quite right with the scope considering my (lack of) understanding of singleton_classes, but it seems to work properly in the various tests could come up with. Note that it only does this workaround if to_s is defined explicitly by the receiver - this is a check in the 4th line of the method. This will miss cases where, for example, you inherit a class and don't redefine to_s, so you may want to remove that for your code. YMMV.

    class NoMethodError
      def detailed_message(*a,**k)
        # Does the object *have* a to_s?  It could be unbound?
        return super(*a,**k) unless receiver.respond_to?(:to_s)
        # Does the object actually define to_s (itself)?
        return super(*a,**k) unless receiver.method(:to_s).owner == receiver.class
    
        # Save the old inspect
        oldInspect = receiver.respond_to?(:inspect) ? receiver.method(:inspect) : nil
    
        # Redefine inspect to call to_s
        receiver.define_singleton_method(:inspect) do; to_s; end
    
        # Get the detailed_message with to_s
        gotMessage = super(*a,**k)
    
        # Restore the inspect method
        if oldInspect
          receiver.define_singleton_method(:inspect, oldInspect)
        else
          # I think we could have just done this one...
          receiver.singleton_class.remove_method(:inspect)
        end
    
        # Return message
        gotMessage
      end
    end