I have a table like this:
Column1 | Column2 | Column3 | Column4 | Column5 |
---|---|---|---|---|
100 | John | [-, 1] | [brown, yellow] | [nan, nan] |
200 | Stefan | [nan, 2] | [nan, yellow] | [-, accepted] |
As you can see, Columns 3-5 consist entirely of lists. I want to remove the dash (`-`) and `nan` elements from the lists in those columns.
The output should look like this:
Column1 | Column2 | Column3 | Column4 | Column5 |
---|---|---|---|---|
100 | John | [1] | [brown, yellow] | [] |
200 | Stefan | [2] | [yellow] | [accepted] |
The closest I was able to get to this outcome was with the following function:

    Table1["Column3"] = Table1["Column3"].apply(lambda x: [el for el in x if el != '-' and not pd.isnull(el)])

The problem is that I don't know how to apply it to all the columns in the DataFrame that hold lists. This is a simplified example; the original has nearly 15 such columns, and I was wondering whether there is a way to do this without writing a separate line like that for each of them.
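Another solution: loop over every column and filter only the cells that hold a list, leaving all other cells untouched:

    import pandas as pd

    # df is the DataFrame from the question
    for c in df.columns:
        df[c] = df[c].apply(
            # drop "-" and NaN from list cells; return any other cell unchanged
            lambda x: [v for v in x if v != "-" and pd.notna(v)]
            if isinstance(x, list)
            else x
        )

    print(df)

Prints:

       Column1 Column2 Column3          Column4     Column5
    0      100    John     [1]  [brown, yellow]          []
    1      200  Stefan     [2]         [yellow]  [accepted]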
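
If you would rather touch only the list columns instead of checking every cell in every column, here is a minimal, self-contained sketch. It assumes the `nan` entries are real missing values (`np.nan`), not the string `"nan"`. The sample frame is rebuilt by hand from the table in the question, and the names `clean_list` and `list_cols` are made up for this example.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (assumption: "nan" means np.nan).
df = pd.DataFrame({
    "Column1": [100, 200],
    "Column2": ["John", "Stefan"],
    "Column3": [["-", 1], [np.nan, 2]],
    "Column4": [["brown", "yellow"], [np.nan, "yellow"]],
    "Column5": [[np.nan, np.nan], ["-", "accepted"]],
})


def clean_list(values):
    """Drop "-" and missing entries from a list; return non-list cells unchanged."""
    if not isinstance(values, list):
        return values
    return [v for v in values if pd.notna(v) and v != "-"]


# Treat a column as a "list column" only if every one of its cells is a list.
list_cols = [
    c for c in df.columns
    if df[c].map(lambda x: isinstance(x, list)).all()
]

# Clean just those columns; the others are never touched.
for c in list_cols:
    df[c] = df[c].map(clean_list)

print(df)
```

If the real data stores the literal string `"nan"`, the condition would need an extra `v != "nan"` check, since `pd.notna` only catches actual missing values.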