I have the below dataset:
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
## Given datapoints
xdata = np.array([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26])
ydata = np.array([1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0.99330715, 0.98201379, 0.95257413, 0.88079708, 0.73105858, 0.5])
## Plot the data
plt.plot(xdata, ydata)
plt.xlabel('x')
plt.ylabel('sigmoid(x)')
plt.xlim([-1,31])
plt.ylim(0, 1.05)
plt.show()
The above data looks as such:
A curve needs to be caliberated and extrapolated for y decreasing from 1 to 0 by using curve_fit in python.
I am trying to use sigmoid function provided that 'y' is given and 'x' need to be found.
The sigmoid function to fit 'x' is thus defined as such:
## Define sigmoid function to fit xdata
def sigmoid(y, x0, k):
x = x0 + ((1/k)*(np.log((1/y)-1)))
return x
## Initial guess
p0 = [np.median(xdata), # x0
0.1] # k
## Initialize curve fit
popt, pcov = curve_fit(sigmoid,
ydata,
xdata)
## Define values for y
y = np.arange(1,0,-0.001)
## Evaluate values for x
x = sigmoid(y, *popt)
## Plot tbe actual and fit data
plt.plot(xdata, ydata, 'o', label='data')
plt.plot(x,y, label='fit')
plt.xlim([-10,31])
plt.ylim(0, 1.05)
plt.legend(loc='best')
plt.show()
The fit data looks as such:
It is quite visible that the fit is not good.
Can somebody please let me know how I can fit a curve close to actual data?
You might find lmfit
useful for this (disclaimer: I am a lead author), as it has sigmoidal step functions built in and can readily do fits and use those results to interpolate or extrapolate. It also gives more useful reports of results than curve_fit
, with variable parameters that have meaningful names and uncertainties and correlations correctly calculated, sorted and assigned.
Your example might look like this:
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import StepModel, ConstantModel
#
xdata = np.array([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26])
ydata = np.array([1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0.99330715, 0.98201379, 0.95257413, 0.88079708, 0.73105858, 0.5])
# create a logistic step function + an offset constant
model = StepModel(form='logistic') + ConstantModel()
# make a set of parameters with initial values
params = model.make_params(c=1, amplitude=-1, center=25, sigma=2.5)
# fit the data with the parameters and `x` independent variable
result = model.fit(ydata, params, x=xdata)
# print results
print(result.fit_report())
# evaluate the model with best-fit parameters to
# interpolate and extrapolate to higher x values
xnew = np.linspace(15, 45, 121)
ynew = model.eval(result.params, x=xnew)
## Plot the data, best-fit result, and extrapolated data
plt.plot(xdata, ydata, 'o', label='data')
plt.plot(xdata, result.best_fit, '-', label='best fit')
plt.plot(xnew, ynew, '+', label='exptrapolated')
plt.xlabel('x')
plt.ylabel('sigmoid(x)')
plt.xlim([-1,41])
plt.ylim(0, 1.05)
plt.legend()
plt.show()
which would print a report of
[[Model]]
(Model(step, form='logistic') + Model(constant))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 82
# data points = 27
# variables = 4
chi-square = 6.4177e-06
reduced chi-square = 2.7903e-07
Akaike info crit = -403.811763
Bayesian info crit = -398.628416
R-squared = 0.99997896
[[Variables]]
amplitude: -0.99669011 +/- 0.01320863 (1.33%) (init = -1)
center: 25.9928183 +/- 0.02578216 (0.10%) (init = 25)
sigma: 0.99867632 +/- 0.00629747 (0.63%) (init = 2.5)
c: 1.00015992 +/- 1.1486e-04 (0.01%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
C(amplitude, center) = -0.997
C(center, sigma) = 0.954
C(amplitude, sigma) = -0.943
C(sigma, c) = 0.287
C(amplitude, c) = -0.223