python trace sys

Avoid tracing the outer exec() wrapper when debugging a Python script using sys.settrace()


I'm building a custom Python debugger that uses sys.settrace() to inspect and log execution events. My script accepts a .py input file as a parameter and inspects every event within it using:

code_str = wrap_input(INPUT_PATH)
x = compile(code_str, INPUT_PATH, 'exec')

sys.settrace(tracer)
exec(x)
sys.settrace(None)

The problem is that the tracer picks up events from both the outer exec() wrapper and the code inside the input file.

Here's a simplified log of what I see:

A call encountered in         <module>() at line number 0 
Line 1 → {'code_str': ..., 'x': <code object <module> at ...>}
Line 5 → {'code_str': ..., 'x': <code object <module> at ...>, 'foo': <function foo at ...>}
// File starts here
A call encountered in         foo() at line number 1
Line 2 → {}
Line 3 → {'x': 5}
A return encountered in         foo() at line number 3
// File ends here
A return encountered in         <module>() at line number 5

I'm only interested in tracing what actually happens inside the input file's code — not the outer wrapper.


What I already tried:

I attempted to filter frames like this:

if frame.f_code.co_filename != INPUT_PATH:
    return None

But oddly, frame.f_code.co_filename is equal to INPUT_PATH every time, even during the initial exec() call (which I want to ignore). So this filter doesn't help me distinguish between the "outer exec context" and the real logic inside the input file.
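
A likely explanation for the filter matching everything: compile() stamps its filename argument onto the module-level code object and onto every code object nested inside it, so each frame created from that compiled code reports the same co_filename. The sketch below illustrates this with a hypothetical filename ("input.py") and a hypothetical sample source; neither comes from the original script.

```python
import types

# Hypothetical stand-in for the contents of the input file.
source = "def foo():\n    x = 5\n    return x\n"

# The second argument plays the role of INPUT_PATH in the question.
code = compile(source, "input.py", "exec")

# Code objects for functions defined in the source are stored in co_consts.
nested = [c for c in code.co_consts if isinstance(c, types.CodeType)]

print(code.co_name, code.co_filename)            # <module> input.py
print(nested[0].co_name, nested[0].co_filename)  # foo input.py
```

If that is what is happening here, the <module> frame in the log is the top level of the compiled input rather than a separate frame belonging to the debugger, which would be why co_filename cannot tell the two apart.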


Question:

How can I avoid tracing the outer exec() logic while still tracing the input file itself?

In short, how do I tell when tracing has entered "actual" user-defined code (like functions in the input file), not just the wrapping call?


Answer (edited):

This is the modified tracer function:

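def tracer(frame, event, arg=None):
    # Count how many frames are on the stack, walking from the current frame outward
    depth = 0
    current_frame = frame
    while current_frame is not None:
        depth += 1
        current_frame = current_frame.f_back
    # Frames at depth <= 3 belong to the debugger and the exec() wrapper in my setup
    if depth > 3:
        [...]
        return tracer
    # Returning None disables local tracing for this shallow frame
    return None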

Solution

  • Take a look at this other Stack Overflow question. I think the current depth of the stack distinguishes between execution contexts regardless of filename: it separates the wrapping call (shallow) from user-defined code (deep). As the linked source says, "The trace hook is modified by passing a callback function to sys.settrace()." So you can filter out events while the depth is <= 3 and start tracing as soon as the depth is > 3, since the deeper frames belong to user-defined functions. Note that the threshold of 3 matches my setup; it changes if exec() is called from a different nesting level.
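
The depth filter described above can be sketched as a self-contained script. This is a minimal variation under stated assumptions, not the exact code from the answer: instead of the hard-coded threshold of 3, it measures the depth of the frame that calls exec() and skips everything up to one level below it. The names stack_depth, trace_source, and SAMPLE are hypothetical helpers introduced for illustration.

```python
import sys

def stack_depth(frame):
    """Count the frames from `frame` out to the outermost caller."""
    depth = 0
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth

def trace_source(source, filename="<input>"):
    """Run `source` and record trace events only for frames below its top level."""
    events = []
    code = compile(source, filename, "exec")
    # Depth of this function's own frame; the exec'd <module> frame sits one level deeper.
    base = stack_depth(sys._getframe())

    def tracer(frame, event, arg=None):
        if stack_depth(frame) <= base + 1:
            return None  # top-level frame of the exec'd code: do not trace it
        events.append((event, frame.f_code.co_name, frame.f_lineno))
        return tracer

    sys.settrace(tracer)
    try:
        exec(code, {})
    finally:
        sys.settrace(None)  # always switch tracing off, even if the input raises
    return events

# Hypothetical input mirroring the foo() example from the log in the question.
SAMPLE = "def foo():\n    x = 5\n    return x\nfoo()\n"

events = trace_source(SAMPLE)
for event, name, lineno in events:
    print(event, name, lineno)
```

Computing the base depth at runtime avoids having to re-tune the threshold when the debugger is invoked from a different call depth. sys._getframe() is a CPython implementation detail, so this sketch assumes CPython.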