[SOLVED] DataFrame: replace NaN with random values in a range

I have a DataFrame in Python with NaN values, like this:

import pandas as pd
details = {
    'info1' : [10,None,None,None,None,None,15,None,None,None,5],
    'info2' : [15,None,None,None,10,None,None,None,None,None,20],
}
df = pd.DataFrame(details)
print(df)

	info1	info2
0	10	15
1	nan	nan
2	nan	nan
3	nan	nan
4	nan	10
5	nan	nan
6	15	nan
7	nan	nan
8	nan	nan
9	nan	nan
10	5	20

How can I replace the NaNs with random numbers (e.g., uniformly distributed) drawn from a specific range, bounded by the nearest rows that have values?

Solution

For a vectorized solution, call np.random.uniform directly with the ffill/bfill results as the boundaries:

import numpy as np

df[:] = np.random.uniform(df.ffill(), df.bfill())

Output (with np.random.seed(0)):

        info1      info2
0   10.000000  15.000000
1   13.013817  12.275584
2   12.118274  11.770529
3   12.187936  10.541135
4   14.818314  10.000000
5   13.958625  15.288949
6   15.000000  19.255966
7   14.289639  10.871293
8   14.797816  18.326198
9    7.218432  18.700121
10   5.000000  20.000000
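
To see why this works, it can help to look at the two boundary frames on their own. The following is a minimal sketch using the question's data; the names low, high and known are only illustrative. It shows that existing values get identical bounds (so they come back unchanged), and that a gap between a larger and a smaller value yields a "low" that is greater than "high":

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'info1': [10, None, None, None, None, None, 15, None, None, None, 5],
    'info2': [15, None, None, None, 10, None, None, None, None, None, 20],
})

low = df.ffill()   # last known value at or above each row
high = df.bfill()  # next known value at or below each row

# Show both bounds side by side, grouped under 'low' and 'high'
print(pd.concat({'low': low, 'high': high}, axis=1))

# Where the original value is known, both bounds equal that value,
# so uniform(low, high) can only return the value itself there.
known = df.notna().to_numpy()
same_bounds = np.array_equal(low.to_numpy()[known], high.to_numpy()[known])
print(same_bounds)

# Rows 7-9 of info1 sit between 15 and 5, so here low > high.
print(low.loc[7, 'info1'], high.loc[7, 'info1'])
```

Note that this assumes every column starts and ends with a known value, as in the question; otherwise ffill/bfill leave NaN at the edges.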

Note

As pointed out by @wjandrea, the behavior of uniform is not officially supported when low > high (*). If you want a robust solution, use my original approach, which stacks both boundaries in an intermediate array and sorts them:

import numpy as np

# ensure the low boundary is before the high
tmp = np.sort(np.dstack([df.ffill(),
                         df.bfill()]),
              axis=2)

# generate random numbers between low and high
df[:] = np.random.uniform(tmp[..., 0], tmp[..., 1])
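
As an alternative sketch of the same idea (not the approach above, just an equivalent way to order the bounds), the element-wise np.minimum/np.maximum can replace the dstack/sort step. This version also swaps in np.random.default_rng for reproducibility, which is an assumption on my side and not required, and writes to a new frame named filled so the result can be checked against the input:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'info1': [10, None, None, None, None, None, 15, None, None, None, 5],
    'info2': [15, None, None, None, 10, None, None, None, None, None, 20],
})

known = df.notna().to_numpy()       # mask of values present in the input
original = df.to_numpy(copy=True)   # untouched copy for the checks below

ffilled = df.ffill().to_numpy()
bfilled = df.bfill().to_numpy()

# Order the bounds element-wise so that low <= high everywhere
low = np.minimum(ffilled, bfilled)
high = np.maximum(ffilled, bfilled)

# Seeded generator (illustrative choice) so the run is repeatable
rng = np.random.default_rng(0)
filled = pd.DataFrame(rng.uniform(low, high),
                      index=df.index, columns=df.columns)

values = filled.to_numpy()
# Known values survive because their low and high bounds are equal
print(np.array_equal(values[known], original[known]))
# Every filled value lies within its neighbours' range
print(bool(((values >= low) & (values <= high)).all()))
```

Both checks should print True for this data; assign back with df[:] = filled.to_numpy() if you want the in-place behavior of the code above.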

(*) From the numpy.random.uniform documentation: "If high < low, the results are officially undefined and may eventually raise an error, i.e. do not rely on this function to behave when passed arguments satisfying that inequality condition."
