
Unexpected line given when Python interpreter reports exception


When the Python interpreter reports an error/exception (I'm just going to say "error" to refer to both of these from now on), it prints the line number and contents of the line that caused the error.

Interestingly, if a long-running Python script raises an error and you change the .py file while the script is still running, the interpreter can report the wrong line contents for the error, because it takes them from the changed .py file.

MWE:

sample.py

from time import sleep

for i in range(10):
    print(i)
    sleep(1)

raise Exception("foo", "bar")

This script runs for 10 seconds, then raises an exception.

sample2.py

from time import sleep

for i in range(10):
    print(i)
    sleep(1)
"""
This
is
just
some
filler
to
demonstrate
the
behavior
"""
raise Exception("foo", "bar")

This file is identical to sample.py except that it has some junk between the end of the loop and the line that raises the following exception:

Traceback (most recent call last):
  File "sample.py", line 7, in <module>
Exception: ('foo', 'bar')

What I Did

  1. python3 sample.py
  2. In a second terminal window, mv sample.py sample.py.bak && cp sample2.py sample.py before sample.py finishes execution
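The two steps above can also be scripted so that the timing does not depend on switching terminals. This is only a sketch under a few assumptions of mine: the child script is a shortened stand-in for sample.py that waits on stdin instead of sleeping, the file is overwritten in place rather than swapped with mv/cp, and the helper name reproduce is made up for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Shortened stand-in for sample.py: it waits on stdin instead of sleeping.
ORIGINAL = (
    'print("ready", flush=True)\n'
    'input()\n'
    'raise Exception("foo", "bar")\n'
)

# Same program with one filler line inserted before the raise (like sample2.py).
MODIFIED = (
    'print("ready", flush=True)\n'
    'input()\n'
    'filler = "this line never ran"\n'
    'raise Exception("foo", "bar")\n'
)


def reproduce():
    """Run the original script, replace it on disk mid-run, return its stderr."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.py")
        with open(path, "w") as f:
            f.write(ORIGINAL)

        proc = subprocess.Popen(
            [sys.executable, path],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
        )

        # Once "ready" arrives, the child has already compiled the original file.
        proc.stdout.readline()

        # Change the file on disk while the child is still running.
        with open(path, "w") as f:
            f.write(MODIFIED)

        # Unblock input() so the child goes on to raise the exception.
        _, err = proc.communicate("\n")
        return err


if __name__ == "__main__":
    print(reproduce())
```

The traceback that comes back still names line 3, which is where the raise was in the code that actually ran, so any source line printed next to it has to come from somewhere other than the running program.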

Expected Behavior

The interpreter reports the following:

Traceback (most recent call last):
  File "sample.py", line 7, in <module>
Exception: ('foo', 'bar')

Here, the interpreter reports that there was an exception on line 7 of sample.py and prints the Exception.

Actual Behavior

The interpreter reports the following:

Traceback (most recent call last):
  File "sample.py", line 7, in <module>
    """
Exception: ('foo', 'bar')

Here, the interpreter also prints """, which is line 7 of the replaced sample.py, when it reports the exception. It seems to be looking in the file on disk to find this information, rather than the file loaded into memory to run the program.

Source of my Confusion

The following is my mental model for what happens when I run python3 sample.py:

  1. The interpreter loads the contents of sample.py into memory
  2. The interpreter performs lexical analysis, semantic analysis, code generation, etc. to produce machine code
  3. The generated code is sent to the CPU and executed
  4. If an error is raised, the interpreter consults the in-memory representation of the source code to produce an error message

Clearly, there is a flaw in my mental model.
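One way to probe steps 2 and 3 of this model is to compile a snippet by hand and look at what comes out. The sketch below is not how the interpreter is driven internally; it only uses the built-in compile and the dis module, and the helper name inspect_compiled is made up for illustration.

```python
import dis
import types


def inspect_compiled(source, filename="sample.py"):
    """Compile source by hand and report what the resulting object holds."""
    # The file name is only a label here; compile() does not open the file.
    code = compile(source, filename, "exec")
    return {
        "is_code_object": isinstance(code, types.CodeType),
        # The compiled instructions are bytecode stored as a bytes object.
        "bytecode_type": type(code.co_code).__name__,
        "filename": code.co_filename,
        "opnames": [ins.opname for ins in dis.get_instructions(code)],
    }


info = inspect_compiled('raise Exception("foo", "bar")\n')
print(info)
```

In CPython the result is a code object holding bytecode for the interpreter's own evaluation loop, not machine code handed to the CPU, and the only link back to the source is the file name plus line-number information.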

What I want to know:

  1. Why does the Python interpreter consult the file on disk to generate the error message, rather than looking in memory?
  2. Is there some other flaw in my understanding of what the interpreter is doing?

Solution

  • As per the answer linked by @b_c,

    Python doesn't keep track of what source code corresponds to any compiled bytecode. It might not even read that source code until it needs to print a traceback.

    [...]

    When Python needs to print a traceback, that's when it tries to find source code corresponding to all the stack frames involved. The file name and line number you see in the stack trace are all Python has to go on

    [...]

    The default sys.excepthook goes through the native call PyErr_Display, which eventually winds up using _Py_DisplaySourceLine to display individual source lines. _Py_DisplaySourceLine unconditionally tries to find the file in the current working directory (for some reason - misguided optimization?), then calls _Py_FindSourceFile to search sys.path for a file matching that name if the working directory didn't have it.
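The same effect can be seen without a second process. The sketch below compiles a file, edits it on disk, runs the already compiled code, and then looks up the failing line by file name and line number. It goes through the standard library's linecache module, which the Python-level traceback module uses, so it is an analogy to the C path quoted above rather than a trace of it. The file name demo.py and the function name lookup_after_edit are made up for illustration.

```python
import linecache
import os
import tempfile


def lookup_after_edit():
    """Compile a file, edit it on disk, then look up the failing line by name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.py")  # hypothetical file name
        with open(path, "w") as f:
            f.write("x = 1\nraise Exception('foo', 'bar')\n")

        # Compile the original source. The code object keeps the file name
        # and line numbers, not the source text.
        with open(path) as f:
            code = compile(f.read(), path, "exec")

        # Edit the file after compilation: insert a filler line before the raise.
        with open(path, "w") as f:
            f.write("x = 1\nfiller = 'never executed'\nraise Exception('foo', 'bar')\n")

        try:
            exec(code, {})
        except Exception as exc:
            # Walk to the innermost frame, i.e. the compiled module code.
            tb = exc.__traceback__
            while tb.tb_next is not None:
                tb = tb.tb_next
            filename = tb.tb_frame.f_code.co_filename
            lineno = tb.tb_lineno
            # linecache reads the file from disk, by name, at lookup time.
            shown = linecache.getline(filename, lineno).strip()
            return filename == path, lineno, shown


same_file, lineno, shown = lookup_after_edit()
print(same_file, lineno, shown)
```

The line number (2) is that of the raise in the code that actually ran, but the text found for it is the filler line from the edited file, since the file name and line number are all the lookup has to go on.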