I would like to understand why this works fine:
>>> test_string = 'long brown fox jump over a lazy python'
>>> 'formatted "{test_string[0]}"'.format(test_string=test_string)
'formatted "l"'
Yet this fails:
>>> 'formatted "{test_string[-1]}"'.format(test_string=test_string)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string indices must be integers
>>> 'formatted "{test_string[11:14]}"'.format(test_string=test_string)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string indices must be integers
I know this could be used:
'formatted "{test_string}"'.format(test_string=test_string[11:14])
...but that is not possible in my situation.
I am dealing with a sandbox-like environment where a list of variables is passed to str.format() as a dictionary of kwargs. These variables are outside of my control. I know the names and types of the variables in advance, and the format string is my only input. It all works fine when I need to combine a few strings or manipulate numbers and their precision, but it falls apart when I need to extract a substring.
This is explained in the spec of str.format():
The arg_name can be followed by any number of index or attribute expressions. An expression of the form '.name' selects the named attribute using getattr(), while an expression of the form '[index]' does an index lookup using __getitem__().
That is, you can index the string using bracket notation, and the index you put inside the brackets will be the argument of the __getitem__() method of the string. This is indexing, not slicing. The bottom line is that str.format() simply doesn't support slicing in the replacement field (the part between {}), as this functionality isn't part of the spec.
Regarding negative indices, the grammar specifies:
element_index ::= digit+ | index_string
This means that the index can be either a sequence of digits (digit+) or a string. Since a negative index such as -1 is not a sequence of digits, it is parsed as index_string and passed to __getitem__() as the string '-1'. The same happens to 11:14. However, str.__getitem__() only accepts integers (or slice objects), not strings. Hence the error TypeError: string indices must be integers, not 'str'.
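To see what actually reaches __getitem__(), here is a minimal sketch. The Recorder class is a hypothetical helper made up for this illustration (it is not part of the standard library or of the question); it simply returns the repr() of whatever key it receives. The dict example relies on the same parsing rule.

```python
class Recorder:
    """Hypothetical probe: echoes back the key that __getitem__ receives."""

    def __getitem__(self, key):
        return repr(key)


probe = Recorder()

# Digits only: parsed as digit+ and converted to an int.
print('{p[0]}'.format(p=probe))       # 0

# Anything else: parsed as index_string and passed as a str.
print('{p[-1]}'.format(p=probe))      # '-1'
print('{p[11:14]}'.format(p=probe))   # '11:14'

# A dict with a string key '-1' therefore works, because the lookup is d['-1'].
print('{d[-1]}'.format(d={'-1': 'last'}))  # last
```

The quotes in the second and third outputs show that the key arrived as a string, which is exactly what str.__getitem__() rejects.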
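f-strings (Python 3.6+), by contrast, evaluate arbitrary expressions inside the braces, so slicing and negative indices both work there:
>>> test_string = 'long brown fox jump over a lazy python'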
>>> f"formatted {test_string[0]}"
'formatted l'
>>> f"formatted {test_string[0:2]}"
'formatted lo'
>>> f"formatted {test_string[-1]}"
'formatted n'
Alternatively, use str.format(), but slice the argument of str.format() directly, rather than the replacement field:
>>> test_string = 'long brown fox jump over a lazy python'
>>> 'formatted {replacement}'.format(replacement=test_string[0:2])
'formatted lo'
>>> 'formatted {replacement}'.format(replacement=test_string[-1])
'formatted n'