python excel pandas interpolation cubic-spline

SciPy CubicSpline vs. Real Statistics Using Excel Spline()


I'm currently running cubic spline interpolations in Python and Excel. Because of constraints on how some of the data is obtained and needs to be displayed, I need to run the interpolation in Python for some data and use the 'Real Statistics Using Excel' add-in for the rest.

However, as a test, I ran the cubic spline interpolation on the same data in both and got different results.

new_df_expanded.interpolate(method='cubicspline')

is how I'm invoking the cubic spline method from pandas, which, as I understand it, uses the SciPy library to do the cubic spline interpolation.

https://www.real-statistics.com/other-mathematical-topics/spline-fitting-interpolation/

has a function called Spline(), where I just pass in the arrays of data points and it runs the interpolation automatically.

Is there a difference in the logic each one uses to run the spline interpolation?


Solution

  • I was able to get the issue fixed. The pandas interpolate method uses SciPy's CubicSpline under the hood, which defaults to the 'not-a-knot' boundary condition. With that condition, the first and second segments at each curve end are the same polynomial. To make it consistent with the logic of the Excel add-in, change 'bc_type' to 'natural' so that the second derivative at the curve ends is zero.
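
    To illustrate the fix above, here is a minimal sketch comparing the two boundary conditions. The sample data, the variable names, and the float index are made up for illustration and are not from the question. It assumes that pandas forwards extra keyword arguments such as bc_type to SciPy, which is what the fix relies on. It does not reproduce the Excel add-in itself, only the 'natural' condition the add-in is described as using.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import CubicSpline

# Made-up sample points, for illustration only
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# SciPy default boundary condition is 'not-a-knot'
not_a_knot = CubicSpline(x, y)
# 'natural' forces the second derivative to zero at both ends
natural = CubicSpline(x, y, bc_type='natural')

# Same data, different boundary conditions, different interpolated values
x_new = 0.5
print("not-a-knot:", float(not_a_knot(x_new)))
print("natural:   ", float(natural(x_new)))

# Second derivative of the natural spline at the two end points
end_curvature = natural(x[[0, -1]], 2)
print("natural end curvature:", end_curvature)

# The same comparison through pandas: known values sit at 0, 1, 2, 3, 4
# and the NaNs in between are filled using the index as the x-axis.
s = pd.Series(
    [1.0, np.nan, 3.0, np.nan, 2.0, np.nan, 5.0, np.nan, 4.0],
    index=np.linspace(0.0, 4.0, 9),
)
filled_default = s.interpolate(method='cubicspline')
# Extra keyword arguments are passed on to the SciPy interpolator
filled_natural = s.interpolate(method='cubicspline', bc_type='natural')
print(filled_default.loc[0.5], filled_natural.loc[0.5])
```

    If the add-in does use natural boundary conditions, the bc_type='natural' results should be the ones that line up with the Spline() output in Excel. The two variants differ most near the ends of the data.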