python pandas

Linear interpolation lookup of a dataframe


I have a dataframe using pandas, something like:

import pandas as pd

d = {'X': [1, 2, 3], 'Y': [220, 187, 170]}
df = pd.DataFrame(data=d)

The dataframe ends up like:

X Y
1 220
2 187
3 170

I can get the Y value for an X value of 1.0 using

df[df['X'] == 1.0]['Y']

which returns a Series containing 220.

But is there a way to get a linearly interpolated value of Y for an X that falls between two existing X values? For example, for an X value of 1.5, I would want it to return an interpolated value of 203.5.

I tried the interpolate function, but it permanently adjusts the data in the dataframe. I could also write a separate function that would calculate this, but I was wondering if there was a native function in pandas.


Solution

  • You can use np.interp for this:

    import numpy as np
    
    X_value = 1.5
    
    np.interp(X_value, df['X'], df['Y'])
    # 203.5
    

    Make sure that df['X'] is monotonically increasing. np.interp does not check this, and its results are meaningless if the X values are out of order.

    You can use the left and right parameters to customize the return value for out-of-bounds values:

    np.interp(0.5, df['X'], df['Y'], left=np.inf, right=-np.inf)
    # inf
    
    # because 0.5 < df['X'].iloc[0]
    

    By default, an out-of-bounds input returns the Y value at the nearest end of df['X'] (the first Y for inputs below the range, the last Y for inputs above it):

    np.interp(10, df['X'], df['Y'])
    # 170.0
    
    # i.e., match for df['X'].iloc[-1]
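
Since np.interp expects increasing X values, one way to make the lookup safe for a dataframe whose rows may be out of order is to sort a copy first. The sketch below is only an illustration: interp_lookup and its x_col / y_col parameters are hypothetical names, not part of pandas or NumPy, and it assumes the X column has no missing values.

```python
import numpy as np
import pandas as pd


def interp_lookup(df, x, x_col="X", y_col="Y"):
    """Linearly interpolate y_col at x (hypothetical helper, not a pandas API)."""
    # sort_values returns a new DataFrame, so the caller's df is left untouched.
    ordered = df.sort_values(x_col)
    return np.interp(x, ordered[x_col].to_numpy(), ordered[y_col].to_numpy())


# Rows deliberately out of order to show why the sort matters.
df = pd.DataFrame({"X": [3, 1, 2], "Y": [170, 220, 187]})

print(interp_lookup(df, 1.5))         # 203.5
print(interp_lookup(df, [1.0, 2.5]))  # array with values 220.0 and 178.5
```

Passing a list or array for x interpolates several points in one call, and the original dataframe keeps its row order and values.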