I'm using a quadratic function to fit my data. I get a good R2 score but huge uncertainties in my fitting parameters.
Here is the graph and the results:
R2 score: 0.9698143924536671
uncertainty in a, b, and y0: 116.93787913, 10647.11867972, 116.93787935
How should I interpret this result?
Here is how I defined the quadratic function:
def my_quad(x, a, b, y0):
    return a*(1-x**2/(2*b**2)) + y0
Here's how I calculated the uncertainty for the parameters and R2 score:
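import numpy as np
from scipy.optimize import curve_fit
from sklearn.metrics import r2_score  # assumed source of r2_score

popt, pcov = curve_fit(my_quad, x_data, y_data, bounds=([0, 0, -np.inf], [np.inf, np.inf, np.inf]))
a, b, y0 = popt
err = np.sqrt(np.diag(pcov))  # one-sigma uncertainties of a, b, y0
y_pred = my_quad(x_data, *popt)
r2 = r2_score(y_data, y_pred)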
Your model is over-parametrized. You can tell when you expand the polynomial:
a * (1 - x**2 / (2*b**2)) + y0 ->
a - x**2 * a / (2*b**2) + y0 ->
y0+a - x**2 * a / (2*b**2)
There are only two independent parameters: y0 + a and a / (2*b**2).
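To see the degeneracy concretely, here is a minimal sketch. The two parameter sets `p1` and `p2` are values I picked for illustration (they are not fitted results); they differ in every individual parameter but share the same y0 + a and a / (2*b**2), so they should trace out the same curve.

```python
import numpy as np

def my_quad(x, a, b, y0):
    return a * (1 - x**2 / (2 * b**2)) + y0

x = np.linspace(-1, 1, 30)

# Hypothetical parameter sets, both with y0 + a = 1 and a / (2*b**2) = 2
p1 = (4.0, 1.0, -3.0)     # a=4,  b=1, y0=-3
p2 = (16.0, 2.0, -15.0)   # a=16, b=2, y0=-15

# Both reduce to 1 - 2*x**2, so the curves coincide
same_curve = np.allclose(my_quad(x, *p1), my_quad(x, *p2))
print(same_curve)  # True
```

Since the data cannot tell `p1` from `p2`, the optimizer has no way to prefer one over the other.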
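Because only these two combinations matter, infinitely many choices of a, b, and y0 produce exactly the same curve. The data cannot pin down the individual values, so the fit itself is good (high R2) while the uncertainties of the individual parameters are huge. You will be able to fit just as well with any two of your original parameters, and then the uncertainty will be reduced significantly.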
For example:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy.optimize import curve_fit
# generate data
rng = np.random.default_rng(23457834572346)
x = np.linspace(-1, 1, 30)
noise = 0.05 * rng.standard_normal(size=x.shape)
y = -2*x**2 + 1 + noise
# over-parameterized fit
def my_quad(x, a, b, y0):
    return a*(1-x**2/(2*b**2)) + y0
bounds=([0, 0, -np.inf], [np.inf, np.inf, np.inf])
popt, pcov = curve_fit(my_quad, x, y, bounds=bounds)
err = np.sqrt(np.diag(pcov)) # array([3028947.74320428, 544624.83253159, 3028947.74412785])
y_ = my_quad(x, *popt)
r = np.corrcoef(y, y_)[0, 1] # 0.9968876754155439 (Pearson r; square it to get R2)
# remove any one parameter
def my_quad(x, a, b):
    return a*(1-x**2/(2*b**2))
bounds=([0, 0], [np.inf, np.inf])
popt, pcov = curve_fit(my_quad, x, y, bounds=bounds)
err = np.sqrt(np.diag(pcov)) # array([0.01460553, 0.00260903])
y_2 = my_quad(x, *popt)  # keep the first fit in y_ so both can be plotted
r_2 = np.corrcoef(y, y_2)[0, 1] # 0.9968876754155439 (Pearson r; square it to get R2)
# plot results
plt.plot(x, y, '.', label='data')
plt.plot(x, y_, '-', label='three-parameter fit')
plt.plot(x, y_2, '--', label='two-parameter fit')
plt.legend()
plt.show()
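If you want to check for this kind of redundancy without running a fit, one option is to look at the rank of the model's Jacobian. As far as I understand, curve_fit estimates pcov from the Jacobian at the solution (roughly the inverse of J.T @ J, scaled by the residual variance), so a rank-deficient Jacobian means the covariance is ill-defined. The sketch below is only an illustration: the `jacobian` helper and the parameter values are my own, with the derivatives written out by hand.

```python
import numpy as np

def jacobian(x, a, b, y0=None):
    """Hand-written derivatives of a*(1 - x**2/(2*b**2)) [+ y0] w.r.t. each parameter."""
    cols = [
        1 - x**2 / (2 * b**2),   # d/da
        a * x**2 / b**3,         # d/db
    ]
    if y0 is not None:
        cols.append(np.ones_like(x))  # d/dy0
    return np.column_stack(cols)

x = np.linspace(-1, 1, 30)

# Illustrative values: a=1, b=0.5, y0=0 reproduces 1 - 2*x**2
J3 = jacobian(x, a=1.0, b=0.5, y0=0.0)  # three-parameter model
J2 = jacobian(x, a=1.0, b=0.5)          # y0 removed

rank3 = np.linalg.matrix_rank(J3)
rank2 = np.linalg.matrix_rank(J2)
print(rank3, rank2)  # 2 2
```

The three-parameter Jacobian has three columns but only rank 2, because the d/da column is a linear combination of the other two. The two-parameter Jacobian has full rank, which is consistent with the small uncertainties in the second fit.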