pythondebugginginfinite-loopzope

Get stacktrace from stuck python process that does not accept signals


I have to run a legacy Zope2 website and have some grievance with it. The biggest issue is that, occasionally, it just locks up, running at 100% CPU load and not answering to requests anymore. While the problem isn't reproducible on a regular basis, one page containing 3 dynamic graphs triggers it sometimes, so I suspect some kind of race condition that leads to an endless loop or a stuck busywait.

The problem is, I have not yet found a way to debug this thing. There's nothing in the Zope logs and nothing in the system logs. I tried the suggestions from this question to get a stacktrace, but the only signal that has any effect is SIGKILL.

Is there another possibility to find out where exactly the process is when it gets stuck?


Solution

  • If the process is stuck in a way that no other signal gets through, you might want to consider running it from a debugger, instead of trying to attach to it at runtime.

    Also, it might be useful to other debugging tactics, like turning off certain parts of the code to find out the minimal case in which it is still reproducible in order to see what causes it better.