I have a dataframe like
df = pd.DataFrame({'team': ['Red', 'Blue', 'Yellow', 'Green'],
'points': [11, 7, 8, 10],
'other': ["kk;aaa;;;", ";;bb;;", ";", ";;;;"]})
I would like to replace all cells that contain only ";"
with ""
(empty cells). There may be only 1 or many ";"
in a row. If there is anything in the cells other than ";"
, I would like to leave them as is.
In my dataframe the other
column would become:
other
kk;aaa;;;
;;bb;;
#(nothing)
#(nothing)
You can use Series.replace
with a regex pattern:
df['other'] = df.other.replace(r'^;{1,}$','', regex=True)
print(df)
team points other
0 Red 11 kk;aaa;;;
1 Blue 7 ;;bb;;
2 Yellow 8
3 Green 10
Explanation pattern ^;{1,}$
:
^
asserts position at start of the string;
matches the character ;
{1,}
matches the previous token (i.e. ;
) between one and unlimited times$
asserts position at the end of the stringYou can also use this on multiple columns, with df.replace
. E.g.:
df = pd.DataFrame({'team': ['Red', 'Blue', 'Yellow', 'Green'],
'points': [11, 7, 8, 10],
'other': ["kk;aaa;;;", ";;bb;;", ";", ";;;;"],
'other2': ["kk;aaa;;;", ";;bb;;", ";", ";;;;"]})
cols = ['other','other2']
df[cols] = df[cols].replace(r'^;{1,}$','', regex=True)
print(df)
team points other other2
0 Red 11 kk;aaa;;; kk;aaa;;;
1 Blue 7 ;;bb;; ;;bb;;
2 Yellow 8
3 Green 10