Tags: python, matplotlib, scipy

Logistic curve produced by curve_fit is a straight line


I'm trying to produce a Sigmoid/Logistic curve from some input data. I borrowed the curve-fitting code from one post and the plotting code from another.

The result is the following:

from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x, L, x0, k, b):
    y = L / (1 + np.exp(-k*(x-x0))) + b
    return y

data = np.loadtxt("data.csv", delimiter=",")
xdata = data[0]
ydata = data[1]

p0 = [max(ydata), np.median(xdata), 1, min(ydata)]  # this is a mandatory initial guess
fitting_parameters, covariance = curve_fit(sigmoid, xdata, ydata, p0, method='dogbox', maxfev=10000)

plt.plot(xdata, ydata, 'o', label='Data')
plt.plot(xdata, sigmoid(xdata, *fitting_parameters), '-', label='Fit')
plt.legend()
plt.show()

This produces a straight line instead of a logistic fit:

[plot: the fitted curve appears as a straight line through the data]

What am I missing? I know the data is a bit coarse, but is that the cause?

EDIT: Here is the raw data, if useful:

1.15102,1.17397,1.18334,1.18484,1.2073,1.25081,1.26446,1.26535,1.26654,1.29653,1.30118,1.30991,1.32608,1.39597,1.39721,1.41225,1.415,1.41989,1.47602,1.19992,1.23148,1.2895,1.31759,1.33068,1.34391,1.35604,1.35879,1.37359,1.38695,1.40233,1.41753,1.42323,1.43474,1.44706,1.48247,1.50033,1.52272,1.59789,1.09956,1.10712,1.13576,1.16265,1.16993,1.18129,1.19587,1.1973,1.20428,1.23916,1.24522,1.2505,1.26135,1.26542,1.27122,1.2736,1.27456,1.30306,1.34639,1.16272,1.18929,1.28076,1.28145,1.28513,1.28708,1.30215,1.30236,1.30887,1.31634,1.37677,1.37745,1.38119,1.38846,1.43016,1.43046,1.43234,1.48051,1.54508
0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95

Solution

  • When I adjust your plotting code to zoom out, the fit is actually a reasonable sigmoid in the scheme of things; it only looks like a straight line over the narrow x-range of your data.

    plt.plot(xdata, ydata, 'o', label='Data')
    xlim = (-3, 3)
    x = np.linspace(*xlim, 300)
    plt.plot(x, sigmoid(x, *fitting_parameters), '-', label='Fit')
    plt.xlim(*xlim)
    plt.legend()
    plt.show()
    

    [plot: zoomed out, the fitted curve is a full sigmoid spanning a much wider x-range than the data]

    I suppose you want to see something like this:

    p0 = [1, np.median(xdata), 15, 0]  # hand-tuned initial guess
    
    plt.plot(xdata, ydata, 'o', label='Data')
    xlim = (min(xdata)-1, max(xdata)+1)
    x = np.linspace(*xlim, 300)
    plt.plot(x, sigmoid(x, *p0), '-', label='Fit')
    plt.xlim(*xlim)
    plt.legend()
    plt.show()
    

    [plot: the hand-tuned guess sigmoid rises through the data's S-shape]

    But if you try to use curve_fit to optimize the parameters from that guess, it makes the solution look like a straight line again, fitting to only a small portion of the curve. I think the best fit according to the least squares criterion just won't look like what you want it to look like. You would have to change the criterion from "minimize the sum of squared residuals" to something else, or add constraints on what the parameters can be (e.g. fix L and b). See, for example, scipy/scipy#21102 for some information about how to use SciPy's other optimization tools to do curve fitting manually, or you might want to look into the LMFIT package.
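    As an illustration of the constrained approach, a minimal sketch below uses `curve_fit`'s `bounds` argument to pin `L` near 1 and `b` near 0 while leaving `x0` and `k` free. It runs on synthetic points drawn from a known sigmoid (an assumption for demonstration, not the asker's data):

    ```python
    from scipy.optimize import curve_fit
    import numpy as np

    def sigmoid(x, L, x0, k, b):
        return L / (1 + np.exp(-k * (x - x0))) + b

    # Synthetic data from a known sigmoid (L=1, x0=1.3, k=12, b=0) plus noise,
    # standing in for the real data.
    rng = np.random.default_rng(0)
    x = np.linspace(1.0, 1.6, 60)
    y = sigmoid(x, 1, 1.3, 12, 0) + rng.normal(0, 0.02, x.size)

    # bounds=(lower, upper): squeezing L into [0.999, 1.001] and b into
    # [-0.001, 0.001] effectively fixes them, while x0 and k remain free.
    p0 = [1, np.median(x), 10, 0]
    bounds = ([0.999, -np.inf, 0.0, -0.001], [1.001, np.inf, np.inf, 0.001])
    popt, pcov = curve_fit(sigmoid, x, y, p0=p0, bounds=bounds)
    # popt[0] is forced to ~1 and popt[3] to ~0; x0 and k are fitted freely.
    ```

    With bounds supplied, `curve_fit` switches to a bounded least-squares method ('trf' by default), so the initial guess must lie inside the bounds.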


    Update:

    Maybe what you want is to fix L = 1 and b = 0 to reflect the fact that you're working with percentages.

    from scipy.optimize import curve_fit
    import numpy as np
    import matplotlib.pyplot as plt
    
    def sigmoid(x, x0, k):
        y = 1 / (1 + np.exp(-k*(x-x0)))
        return y
    
    xdata = np.asarray([1.15102,1.17397,1.18334,1.18484,1.2073,1.25081,1.26446,1.26535,1.26654,1.29653,1.30118,1.30991,1.32608,1.39597,1.39721,1.41225,1.415,1.41989,1.47602,1.19992,1.23148,1.2895,1.31759,1.33068,1.34391,1.35604,1.35879,1.37359,1.38695,1.40233,1.41753,1.42323,1.43474,1.44706,1.48247,1.50033,1.52272,1.59789,1.09956,1.10712,1.13576,1.16265,1.16993,1.18129,1.19587,1.1973,1.20428,1.23916,1.24522,1.2505,1.26135,1.26542,1.27122,1.2736,1.27456,1.30306,1.34639,1.16272,1.18929,1.28076,1.28145,1.28513,1.28708,1.30215,1.30236,1.30887,1.31634,1.37677,1.37745,1.38119,1.38846,1.43016,1.43046,1.43234,1.48051,1.54508])
    ydata = np.asarray([0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95,0.05,0.1,0.15,0.2,0.25,0.3,0.35,0.4,0.45,0.5,0.55,0.6,0.65,0.7,0.75,0.8,0.85,0.9,0.95])
    
    p0 = [np.median(xdata), 15]  # this is a mandatory initial guess
    fitting_parameters, covariance = curve_fit(sigmoid, xdata, ydata, p0, method='dogbox', maxfev=10000)
    
    plt.plot(xdata, ydata, 'o', label='Data')
    xlim = (min(xdata)-1, max(xdata)+1)
    x = np.linspace(*xlim, 300)
    plt.plot(x, sigmoid(x, *fitting_parameters), '-', label='Fit')
    plt.xlim(*xlim)
    plt.legend()
    plt.show()
    

    Then the best fit is:

    [plot: with L and b fixed, the fitted sigmoid follows the S-shape of the data]


    In response to your follow-up question, to solve for x at a particular y, you can invert the equation algebraically:

    p = 0.2
    x0, k = fitting_parameters
    -np.log(1 / p - 1) / k + x0
    # 1.1584171856437602
    

    Or you can solve numerically:

    from scipy.optimize import root_scalar
    
    def f(x, p):
        return sigmoid(x, *fitting_parameters) - p
    
    p = 0.2
    root_scalar(f, bracket=[0.5, 2.0], args=(p,))
    #       converged: True
    #            flag: converged
    #  function_calls: 12
    #      iterations: 11
    #            root: 1.1584171856442531
    #          method: brentq