Suppose we want to replace multiple substrings via pd.Series.replace or pd.DataFrame.replace by passing a dictionary to the to_replace argument.
Example: replace several substrings in the string 'Nana likes bananas and ananas'.
Let's try a short example:
import pandas as pd
s = pd.Series(['abcde', 'bcde', 'xyz'])
s.replace(to_replace={'ab': 'xy', 'bc': 'BC', 'cd': 'CD', 'xy': 'XY'}, regex=True)
0 xyCDe
1 BCde
2 XYz
dtype: object
Note that the xy that replaces ab is not further replaced by XY.
# let's swap the first two keys
s.replace(to_replace={'bc': 'BC', 'ab': 'xy', 'cd': 'CD', 'xy': 'XY'}, regex=True)
0 aBCde
1 BCde
2 XYz
dtype: object
So the order of the keys matters when the patterns overlap (ab vs bc in abc). Below are other examples.
# overlapping regex, with lookarounds
pd.Series(['abcde']).replace(to_replace={'a(?=b)': 'A', '(?<=b)c': 'C'}, regex=True)
0 AbCde
dtype: object
# overlapping regex in which the first pattern breaks the second one
pd.Series(['abcde']).replace(to_replace={'ab': 'A', '(?<=b)c': 'C'}, regex=True)
0 Acde
dtype: object
# overlapping pattern in which the replacement preserves the second pattern
pd.Series(['abcde']).replace(to_replace={'ab': 'Ab', '(?<=b)c': 'C'}, regex=True)
0 AbCde
dtype: object
# overlapping pattern in which the replacement creates the second pattern
pd.Series(['abcde']).replace(to_replace={'ab': 'Ax', '(?<=x)c': 'C'}, regex=True)
0 Axcde
dtype: object
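One way to summarise the results above is the following pure-Python sketch. It is only a hypothetical model that happens to reproduce every output shown here, not pandas' actual implementation, and the names replace_model and chained are made up for illustration. The assumption is that each pattern is first tested against the original string, and the patterns are then applied in dictionary order to the current (already modified) string, but only where the original matched. A plain chain of re.sub calls is included for contrast.

```python
import re

def replace_model(values, mapping):
    """Hypothetical model of the observed behaviour (an assumption, not pandas code)."""
    # For every pattern, record which ORIGINAL strings it matches.
    masks = {pat: [re.search(pat, v) is not None for v in values] for pat in mapping}
    out = list(values)
    # Apply patterns in dict order to the current strings,
    # but only to elements whose original value matched the pattern.
    for pat, repl in mapping.items():
        out = [re.sub(pat, repl, cur) if hit else cur
               for cur, hit in zip(out, masks[pat])]
    return out

def chained(values, mapping):
    """Naive alternative: apply every pattern in turn, with no check on the original."""
    out = list(values)
    for pat, repl in mapping.items():
        out = [re.sub(pat, repl, cur) for cur in out]
    return out

mapping = {'ab': 'xy', 'bc': 'BC', 'cd': 'CD', 'xy': 'XY'}
print(replace_model(['abcde', 'bcde', 'xyz'], mapping))  # ['xyCDe', 'BCde', 'XYz']
print(chained(['abcde', 'bcde', 'xyz'], mapping))        # ['XYCDe', 'BCde', 'XYz']
```

Under this model the xy produced from ab is left alone because the original 'abcde' never matched xy, whereas the naive chain turns it into XY. The last example above ('ab' to 'Ax', then '(?<=x)c') is explained the same way: the second pattern does not match the original string, so it is never applied. Whether pandas guarantees this behaviour across versions is not something this sketch can confirm.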