python pandas numpy

Use np.where to create a list with same number of elements, but different content


I have a pandas DataFrame in which one column sometimes contains NaN. I want to fill each missing entry with a list of strings that has the same length as the list in another column of the same row:

import pandas as pd
import numpy as np

df = pd.DataFrame({"a": ["one", "two"],
                   "b": ["three", "four"],
                   "c": [[1, 2], [3, 4]],
                   "d": [[5, 6], np.nan]})

a b c d
one three [1, 2] [5, 6]
two four [3, 4] NaN

and I want this to become

a b c d
one three [1, 2] [5, 6]
two four [3, 4] [no_value, no_value]

I tried

df["d"] = np.where(df.d.isna(),
                   ['no_value' for element in df.c],
                   df.d)

and

df["d"] = np.where(df.d.isna(),
                   ['no_value'] * len(df.c),
                   df.d)

but neither works. Does anyone have an idea?

SOLUTION: I adjusted PaulS's answer a little to:

df['d'] = np.where(df.d.isna(),
                   pd.Series([['no_value'] * len(lst) for lst in df.c]),
                   df.d)

Solution

  • A possible solution uses np.where:

    df.assign(d = np.where(
        df['d'].isna(), 
        pd.Series([['no_value'] * len(lst) for lst in df['c']]), 
        df['d']))
    

    Another possible solution uses apply:

    df['d'] = df.apply(
        lambda x: ['no_value'] * len(x['c']) 
        if (x['d'] is np.nan) else x['d'], axis=1)
    

    Output:

         a      b       c                     d
    0  one  three  [1, 2]                [5, 6]
    1  two   four  [3, 4]  [no_value, no_value]
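
To see why the attempts in the question differ from the np.where answer above, the sketch below runs both on the same data. As far as I can tell, np.where picks values element by element, so a flat list of strings yields a single string per row, while a Series holding one list per row yields a list. The names mask, flat, fill, nested and by_type are only illustrative. The last variant, which tests isinstance(..., list) instead of identity with np.nan, is my own assumption and not part of the answer; it may be useful when the missing value is a NaN that is not the np.nan object itself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": ["one", "two"],
                   "b": ["three", "four"],
                   "c": [[1, 2], [3, 4]],
                   "d": [[5, 6], np.nan]})

mask = df["d"].isna()

# Second attempt from the question: a flat list with one string per row.
# np.where selects element-wise, so the missing row receives a plain string.
flat = np.where(mask, ["no_value"] * len(df["c"]), df["d"])

# Approach from the answer: one replacement list per row, sized from "c".
fill = pd.Series([["no_value"] * len(lst) for lst in df["c"]], index=df.index)
nested = np.where(mask, fill, df["d"])

# Illustrative variant (an assumption, not from the answer): keep existing
# lists and build a replacement for anything that is not a list.
by_type = pd.Series(
    [d if isinstance(d, list) else ["no_value"] * len(c)
     for c, d in zip(df["c"], df["d"])],
    index=df.index,
)

print(flat[1])
print(nested[1])
print(by_type.tolist())
```

Assigning nested or by_type back with df["d"] = ... should give the output shown above.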