I have a pandas DataFrame in which one column sometimes contains NA. I want to fill the missing entries in that column with a list of strings that has the same length as the list in another column of the same row:
import pandas as pd
import numpy as np
df = pd.DataFrame({"a": ["one", "two"],
                   "b": ["three", "four"],
                   "c": [[1, 2], [3, 4]],
                   "d": [[5, 6], np.nan]})
a | b | c | d |
---|---|---|---|
one | three | [1, 2] | [5, 6] |
two | four | [3, 4] | NaN |
and I want this to become
a | b | c | d |
---|---|---|---|
one | three | [1, 2] | [5, 6] |
two | four | [3, 4] | [no_value, no_value] |
I tried
df["d"] = np.where(df.d.isna(),
['no_value' for element in df.c],
df.d)
and
df["d"] = np.where(df.d.isna(),
['no_value'] * len(df.c),
df.d)
but neither works. Does anyone have an idea?
SOLUTION: I adjusted PaulS's answer slightly to:
df["d"] = np.where(df.d.isna(),
                   pd.Series([['no_value'] * len(lst) for lst in df.c]),
                   df.d)
A possible solution is to use np.where:
df.assign(d = np.where(
df['d'].isna(),
pd.Series([['no_value'] * len(lst) for lst in df['c']]),
df['d']))
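The following is a minimal sketch of why the replacement values are wrapped in pd.Series in the snippet above. The names as_list, as_series, list_shape, series_shape and result are only illustrative. A plain list of equal-length lists is converted by NumPy into a 2-D array, whereas a Series of lists converts to a 1-D object array whose elements are the lists themselves, which is the shape np.where needs to pick one value per row:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": ["one", "two"],
                   "b": ["three", "four"],
                   "c": [[1, 2], [3, 4]],
                   "d": [[5, 6], np.nan]})

# One replacement list per row, sized like the list in column c.
as_list = [["no_value"] * len(lst) for lst in df["c"]]

# A plain list of lists becomes a 2-D array (rows x list length).
list_shape = np.asarray(as_list).shape

# A Series of lists becomes a 1-D object array, one list per row.
as_series = pd.Series(as_list)
series_shape = np.asarray(as_series).shape

# np.where now chooses row by row between two 1-D inputs.
# Note: np.where works positionally and ignores the Series index.
result = df.assign(d=np.where(df["d"].isna(), as_series, df["d"]))
print(result)
```

Note that df.assign returns a new DataFrame and leaves df unchanged, so the result has to be kept, as in result above.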
Another possible solution uses apply on df, iterating through each row with a lambda function that checks whether the value in column d is NaN. If it is, the function generates a new list filled with the string 'no_value', repeated to match the length of the list in column c. Otherwise, the original value in x['d'] is retained.
df['d'] = df.apply(
lambda x: ['no_value'] * len(x['c'])
if (x['d'] is np.nan) else x['d'], axis=1)
Output:
a b c d
0 one three [1, 2] [5, 6]
1 two four [3, 4] [no_value, no_value]
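The identity test x['d'] is np.nan relies on the missing value being the np.nan object itself, which may not hold for NaN values produced elsewhere. As a hedged alternative, and assuming that every valid entry in column d is a Python list, the check can be written as a type test instead. The variable name filled is only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": ["one", "two"],
                   "b": ["three", "four"],
                   "c": [[1, 2], [3, 4]],
                   "d": [[5, 6], np.nan]})

# Keep d when it is already a list (assumption: valid entries are lists);
# otherwise build a placeholder list as long as the list in column c.
filled = [
    d if isinstance(d, list) else ["no_value"] * len(c)
    for c, d in zip(df["c"], df["d"])
]

# Wrap in a Series with the original index so each list stays one cell.
df["d"] = pd.Series(filled, index=df.index)
print(df)
```

This avoids both the row-wise apply and any dependence on how the missing value is represented.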