I'm trying to produce a Sigmoid/Logistic curve from some input data. I borrowed code from this post and this for plotting.
The result is the following:
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
def sigmoid(x, L ,x0, k, b):
y = L / (1 + np.exp(-k*(x-x0))) + b
return (y)
data = np.loadtxt("data.csv", delimiter=",")
xdata = data[0]
ydata = data[1]
p0 = [max(ydata), np.median(xdata),1,min(ydata)] # this is an mandatory initial guess
fitting_parameters, covariance = curve_fit(sigmoid, xdata, ydata,p0, method='dogbox', maxfev=10000)
plt.plot(xdata, ydata, 'o', label='Data')
plt.plot(xdata, sigmoid(xdata, *fitting_parameters), '-', label='Fit')
plt.legend()
plt.show()
This produces a straight line instead of a logistic fit:
What am I missing? I know the data is a bit coarse, but is that the cause?
EDIT: Here is the raw data, if useful:
1.15102,1.17397,1.18334,1.18484,1.2073,1.25081,1.26446,1.26535,1.26654,1.29653,1.30118,1.30991,1.32608,1.39597,1.39721,1.41225,1.415,1.41989,1.47602,1.19992,1.23148,1.2895,1.31759,1.33068,1.34391,1.35604,1.35879,1.37359,1.38695,1.40233,1.41753,1.42323,1.43474,1.44706,1.48247,1.50033,1.52272,1.59789,1.09956,1.10712,1.13576,1.16265,1.16993,1.18129,1.19587,1.1973,1.20428,1.23916,1.24522,1.2505,1.26135,1.26542,1.27122,1.2736,1.27456,1.30306,1.34639,1.16272,1.18929,1.28076,1.28145,1.28513,1.28708,1.30215,1.30236,1.30887,1.31634,1.37677,1.37745,1.38119,1.38846,1.43016,1.43046,1.43234,1.48051,1.54508
0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95
When I adjust your plotting code to zoom out, it fits nicely, in the scheme of things.
plt.plot(xdata, ydata, 'o', label='Data')
xlim = (-3, 3)
x = np.linspace(*xlim, 300)
plt.plot(x, sigmoid(x, *fitting_parameters), '-', label='Fit')
plt.xlim(*xlim)
plt.legend()
plt.show()
I suppose you want to see something like this:
p0 = [1, np.median(xdata), 15, 0] # this is an mandatory initial guess
plt.plot(xdata, ydata, 'o', label='Data')
xlim = (min(xdata)-1, max(xdata)+1)
x = np.linspace(*xlim, 300)
plt.plot(x, sigmoid(x, *p0), '-', label='Fit')
plt.xlim(*xlim)
plt.legend()
plt.show()
But if you try to use curve_fit
to optimize the parameters from that guess, it makes the solution look like a straight line again, fitting to only a small portion of the curve. I think the best fit according to the least squares criterion just won't look like what you want it to look like. You would have to change the criterion from "minimize the sum of squared residuals" to something else, or add constraints on what the parameters can be (e.g. fix L
and b
). See, for example, scipy/scipy#21102 for some information about how to use SciPy's other optimization tools to do curve fitting manually, or you might want to look into the LMFIT package.
Update:
Maybe what you want is to fix L = 1
and b = 0
to reflect the fact that you're working with percentages.
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
def sigmoid(x, x0, k):
y = 1 / (1 + np.exp(-k*(x-x0)))
return y
xdata = np.asarray([1.15102,1.17397,1.18334,1.18484,1.2073,1.25081,1.26446,1.26535,1.26654,1.29653,1.30118,1.30991,1.32608,1.39597,1.39721,1.41225,1.415,1.41989,1.47602,1.19992,1.23148,1.2895,1.31759,1.33068,1.34391,1.35604,1.35879,1.37359,1.38695,1.40233,1.41753,1.42323,1.43474,1.44706,1.48247,1.50033,1.52272,1.59789,1.09956,1.10712,1.13576,1.16265,1.16993,1.18129,1.19587,1.1973,1.20428,1.23916,1.24522,1.2505,1.26135,1.26542,1.27122,1.2736,1.27456,1.30306,1.34639,1.16272,1.18929,1.28076,1.28145,1.28513,1.28708,1.30215,1.30236,1.30887,1.31634,1.37677,1.37745,1.38119,1.38846,1.43016,1.43046,1.43234,1.48051,1.54508])
ydata = np.asarray([0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95])
p0 = [np.median(xdata), 15] # this is an mandatory initial guess
fitting_parameters, covariance = curve_fit(sigmoid, xdata, ydata,p0, method='dogbox', maxfev=10000)
plt.plot(xdata, ydata, 'o', label='Data')
xlim = (min(xdata)-1, max(xdata)+1)
x = np.linspace(*xlim, 300)
plt.plot(x, sigmoid(x, *fitting_parameters), '-', label='Fit')
plt.xlim(*xlim)
plt.legend()
plt.show()
Then the best fit is:
In response to your follow-up question, to solve for x
at a particular y
, you can invert the equation algebraically:
p = 0.2
x0, k = fitting_parameters
-np.log(1 / p - 1) / k + x0
# 1.1584171856437602
Or you can solve numerically:
from scipy.optimize import root_scalar
def f(x, p):
return sigmoid(x, *fitting_parameters) - p
p = 0.2
root_scalar(f, bracket=[0.5, 2.0], args=(p,))
# converged: True
# flag: converged
# function_calls: 12
# iterations: 11
# root: 1.1584171856442531
# method: brentq