Consider a tiny properties parser snippet:
testx="""var1 = foo
var2 = bar"""
dd = { l.split('=')[0].strip():l.split('=')[1].strip() for l in testx.split('\n')}
print(dd)
# {'var1': 'foo', 'var2': 'bar'}
That works, but it is ugly because "split" is invoked twice in l.split('=')[0].strip():l.split('=')[1].strip(). How can the dictionary comprehension be changed to split only once and then build the dict entries as:
l[0].strip():l[1].strip()
Does that refactoring require a nested for comprehension or a different way of constructing a single level comprehension?
If you are using Python >= 3.8, this is exactly the kind of case assignment expressions were added for:
>>> {(parts:=l.split('='))[0].strip(): parts[1].strip() for l in testx.split("\n")}
{'var1': 'foo', 'var2': 'bar'}
Prior to this, you could do something like:
>>> {key.strip():value.strip() for l in testx.split('\n') for key, value in [l.split("=")]}
{'var1': 'foo', 'var2': 'bar'}
Which, honestly, I find more readable.
That said, both of these are still pretty unreadable to me. At the end of the day, I don't think you can beat:
>>> result = {}
>>> for l in testx.split("\n"):
...     key, value = l.split("=")
...     result[key.strip()] = value.strip()
...
>>> result
{'var1': 'foo', 'var2': 'bar'}
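If the plain loop ends up in real code, it is easy to wrap in a small function. The following is only a sketch: the name parse_properties, the skipping of blank lines, and the use of maxsplit=1 (so that a value may itself contain "=") are my assumptions, not something the question asked for.

```python
def parse_properties(text):
    """Parse 'key = value' lines into a dict, splitting each line only once."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines carry no data and are skipped
        # maxsplit=1 keeps any further '=' characters inside the value.
        # A line without '=' raises ValueError here, just like the original loop.
        key, value = line.split("=", 1)
        result[key.strip()] = value.strip()
    return result


testx = """var1 = foo
var2 = bar"""

print(parse_properties(testx))          # {'var1': 'foo', 'var2': 'bar'}
print(parse_properties("url = a=b\n"))  # {'url': 'a=b'}
```

Whether a malformed line should raise or be skipped depends on the input you expect, so treat that choice as a placeholder.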
Note that the for <target list> in [<expression>] idiom has actually been optimized in Python 3.9:
https://docs.python.org/3/whatsnew/3.9.html#optimizations
Optimized the idiom for assignment a temporary variable in comprehensions. Now for y in [expr] in comprehensions is as fast as a simple assignment y = expr. For example:
sums = [s for s in [0] for x in data for s in [s + x]]
Unlike the := operator this idiom does not leak a variable to the outer scope.
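To make the scoping remark concrete, here is a small sketch (Python 3.8+ for the := part). The two function names are made up for illustration; the comprehensions are the ones from above.

```python
testx = """var1 = foo
var2 = bar"""


def walrus_leftover(text):
    # The := target is bound in the enclosing function, not inside the comprehension.
    {(parts := l.split("="))[0].strip(): parts[1].strip() for l in text.split("\n")}
    return parts  # still visible here: the split of the last line


def list_idiom_leaks(text):
    # key and value are ordinary comprehension targets, so they stay local to it.
    {key.strip(): value.strip() for l in text.split("\n") for key, value in [l.split("=")]}
    try:
        key
    except NameError:
        return False
    return True


print(walrus_leftover(testx))   # ['var2 ', ' bar']
print(list_idiom_leaks(testx))  # False
```

So with := you are left with a stray parts name in the surrounding scope, while the single-item-list idiom cleans up after itself.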
Compare the bytecode in Python 3.8 vs. Python 3.9; you'll notice there is no nested iteration in the Python 3.9 version:
Python 3.8:
Python 3.8.1 (default, Jan 8 2020, 16:15:59)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis('{k:v for l in "a b|c d".split("|") for k,v in [l.split()]}')
1 0 LOAD_CONST 0 (<code object <dictcomp> at 0x7fdbd6249d40, file "<dis>", line 1>)
2 LOAD_CONST 1 ('<dictcomp>')
4 MAKE_FUNCTION 0
6 LOAD_CONST 2 ('a b|c d')
8 LOAD_METHOD 0 (split)
10 LOAD_CONST 3 ('|')
12 CALL_METHOD 1
14 GET_ITER
16 CALL_FUNCTION 1
18 RETURN_VALUE
Disassembly of <code object <dictcomp> at 0x7fdbd6249d40, file "<dis>", line 1>:
1 0 BUILD_MAP 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 30 (to 36)
6 STORE_FAST 1 (l)
8 LOAD_FAST 1 (l)
10 LOAD_METHOD 0 (split)
12 CALL_METHOD 0
14 BUILD_TUPLE 1
16 GET_ITER
>> 18 FOR_ITER 14 (to 34)
20 UNPACK_SEQUENCE 2
22 STORE_FAST 2 (k)
24 STORE_FAST 3 (v)
26 LOAD_FAST 2 (k)
28 LOAD_FAST 3 (v)
30 MAP_ADD 3
32 JUMP_ABSOLUTE 18
>> 34 JUMP_ABSOLUTE 4
>> 36 RETURN_VALUE
Versus Python 3.9:
Python 3.9.0 | packaged by conda-forge | (default, Oct 14 2020, 22:56:29)
[Clang 10.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis('{k:v for l in "a b|c d".split("|") for k,v in [l.split()]}')
1 0 LOAD_CONST 0 (<code object <dictcomp> at 0x7fb3587d1870, file "<dis>", line 1>)
2 LOAD_CONST 1 ('<dictcomp>')
4 MAKE_FUNCTION 0
6 LOAD_CONST 2 ('a b|c d')
8 LOAD_METHOD 0 (split)
10 LOAD_CONST 3 ('|')
12 CALL_METHOD 1
14 GET_ITER
16 CALL_FUNCTION 1
18 RETURN_VALUE
Disassembly of <code object <dictcomp> at 0x7fb3587d1870, file "<dis>", line 1>:
1 0 BUILD_MAP 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 22 (to 28)
6 STORE_FAST 1 (l)
8 LOAD_FAST 1 (l)
10 LOAD_METHOD 0 (split)
12 CALL_METHOD 0
14 UNPACK_SEQUENCE 2
16 STORE_FAST 2 (k)
18 STORE_FAST 3 (v)
20 LOAD_FAST 2 (k)
22 LOAD_FAST 3 (v)
24 MAP_ADD 2
26 JUMP_ABSOLUTE 4
>> 28 RETURN_VALUE
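If you would rather check your own interpreter than read the listings, counting the FOR_ITER instructions is a quick way to do it. This is just a sketch: the helper count_for_iter is my own, and it walks nested code objects because, depending on the Python version, the comprehension may or may not be compiled into a separate code object.

```python
import dis
import types


def count_for_iter(code):
    """Count FOR_ITER instructions in a code object and any code objects nested in it."""
    total = sum(1 for ins in dis.get_instructions(code) if ins.opname == "FOR_ITER")
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            total += count_for_iter(const)
    return total


SRC = '{k: v for l in "a b|c d".split("|") for k, v in [l.split()]}'
code = compile(SRC, "<demo>", "eval")

loops = count_for_iter(code)
result = eval(code)

print(loops)   # number of loops the compiler actually emitted
print(result)  # {'a': 'b', 'c': 'd'}
```

Going by the two disassemblies above, I would expect 2 on Python 3.8 and 1 on Python 3.9, since the inner for over the one-element list is compiled as a plain unpacking assignment there.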