def f():
    print("Before", locals())  # line 2
    print(x)  # line 3
    x = 2  # line 4
    print("After", locals())  # line 5

x = 1
f()
I am aware of the LEGB rule for scoping in Python.
For the above code, when I comment out line 4, everything executes as expected: at line 3, Python does not find the variable x in the local scope, so it searches the global scope, finds it there, and prints 1.
But when I execute the whole code as it is, without commenting anything out, it raises UnboundLocalError: local variable 'x' referenced before assignment.
I do know I can use nonlocal and global, but my question is :
I tried to find the answer in the suggested similar questions but failed. Please correct me if any of my understanding is wrong.
To some extent, the answer is implementation-specific, as Python only specifies the expected behavior, not how to implement it.
That said, let's look at the byte code generated for f by the usual implementation, CPython:
>>> import dis
>>> dis.dis(f)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Before')
              4 LOAD_GLOBAL              1 (locals)
              6 CALL_FUNCTION            0
              8 CALL_FUNCTION            2
             10 POP_TOP

  3          12 LOAD_GLOBAL              0 (print)
             14 LOAD_FAST                0 (x)
             16 CALL_FUNCTION            1
             18 POP_TOP

  4          20 LOAD_CONST               2 (2)
             22 STORE_FAST               0 (x)

  5          24 LOAD_GLOBAL              0 (print)
             26 LOAD_CONST               3 ('After')
             28 LOAD_GLOBAL              1 (locals)
             30 CALL_FUNCTION            0
             32 CALL_FUNCTION            2
             34 POP_TOP
             36 LOAD_CONST               0 (None)
             38 RETURN_VALUE
There are several different LOAD_* op codes used to retrieve various values. LOAD_GLOBAL is used for names in the global (or built-in) scope; LOAD_CONST is used for constant values that are not assigned to any name. LOAD_FAST is used for local variables. Local variables don't even exist by name, but by indices in an array. That's why they are "fast"; they are available in an array rather than a hash table. (LOAD_GLOBAL also uses integer arguments, but that's just an index into an array of names; the name itself still needs to be looked up in whatever mapping provides the global scope.)
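To see the difference between the two kinds of lookup, here is a minimal sketch that compares how the same name x is loaded in two functions. The function names reads_global, reads_local, and load_ops are made up for illustration. Exact opcode names depend on the CPython version (newer releases may show variants such as LOAD_FAST_CHECK), so the sketch only looks at the opcode prefix.

```python
import dis

def reads_global():
    print(x)   # x is never assigned in this body, so it is not a local

def reads_local():
    print(x)   # x is assigned below, so the compiler treats it as a local
    x = 2

def load_ops(func, name):
    """Return the opcode names of the instructions in func that load `name`."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("LOAD") and ins.argval == name
    ]

# Neither function is called; we only inspect the compiled byte code.
print(load_ops(reads_global, "x"))  # a LOAD_GLOBAL instruction
print(load_ops(reads_local, "x"))   # a LOAD_FAST-style instruction
```

Note that neither function is ever called here: the choice of op code is already fixed in the compiled code object.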
You can even see the constants and local values associated with f:
>>> f.__code__.co_consts
(None, 'Before', 2, 'After')
>>> f.__code__.co_varnames
('x',)
LOAD_CONST 1 puts 'Before' on the stack because f.__code__.co_consts[1] == 'Before', and LOAD_FAST 0 puts the value of x on the stack because f.__code__.co_varnames[0] == 'x'.
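The same mapping can be checked from a script rather than the REPL. This is a small sketch that redefines f as in the question and inspects its code object; co_names (the array of names used by LOAD_GLOBAL) is not mentioned above but is a standard attribute of code objects. The exact contents and order of co_consts can vary between CPython versions, so only membership is checked.

```python
def f():
    print("Before", locals())
    print(x)
    x = 2
    print("After", locals())

code = f.__code__

# Local variable names, indexed by slot number: LOAD_FAST 0 refers to slot 0.
print(code.co_varnames)
slot = code.co_varnames.index("x")

# Names looked up by name at run time (globals and built-ins).
print(code.co_names)

# Constants embedded in the function body.
print(code.co_consts)
```

Because x appears in co_varnames and not in co_names, every read of x inside f goes to the local slot, never to the global mapping.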
The key here is that the byte code is generated before f is ever executed. Python isn't simply executing each line the first time it sees it. Executing the def statement involves, among other things, generating the byte code that is stored in the __code__ attribute of the function object.
Part of the code generation is noting that the name x, due to the assignment somewhere in the body of the function (even if that assignment is logically unreachable), is a local name, and therefore must be accessed with LOAD_FAST.
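The point about unreachable assignments can be demonstrated with a short sketch. The function g and the variable outcome are hypothetical names used only for this illustration.

```python
x = 1  # global x

def g():
    if False:
        x = 2   # never executed, but still makes x a local name in g
    return x    # reads the local slot, which was never filled

try:
    g()
    outcome = "returned"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)
print(g.__code__.co_varnames)
```

Even though the assignment can never run, its mere presence in the body is enough for the compiler to classify x as local, so the global x is not consulted.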
At the time locals is called (and indeed before LOAD_FAST 0 is used the first time), no assignment to x (i.e., STORE_FAST 0) has yet been made, so there is no local value in slot 0 to look up.
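Putting it together, the following sketch is a variant of the question's f that records what happens instead of crashing. The names before, raised, and after are introduced here for illustration and are themselves locals of f, which is why only after["x"] is returned.

```python
x = 1  # global x, never consulted inside f

def f():
    before = dict(locals())   # snapshot taken before any local is bound
    try:
        print(x)              # slot for x is still empty at this point
        raised = False
    except UnboundLocalError:
        raised = True
    x = 2                     # the store fills the slot for x
    after = dict(locals())    # now x (and the helper locals) are bound
    return before, raised, after["x"]

before, raised, after_x = f()
print(before, raised, after_x)
```

The first snapshot is empty because no local has been stored yet, the read of x fails for the same reason, and only after the assignment does x show up in locals().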