python matplotlib curve-fitting power-law

Power law data fitting is not correct


What am I doing wrong when trying to fit the following data with a logarithmic y-axis? The code and the resulting graph are below.

[Image: graph produced by the code below]

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c

fig, ax = plt.subplots()
x = np.array([88.08064516, 264.24193548, 440.40322581, 616.56451613, 792.72580645, 968.88709677, 1145.0483871, 1321.20967742, 1497.37096774, 1673.53225806, 1849.69354839, 2025.85483871, 2202.01612903, 2378.17741935, 2554.33870968, 2730.5, 2906.66129032, 3082.82258065, 3258.98387097, 3435.14516129, 3611.30645161, 3787.46774194, 3963.62903226, 4139.79032258, 4315.9516129, 4492.11290323, 4668.27419355, 4844.43548387, 5020.59677419, 5196.75806452, 5372.91935484, 5549.08064516])
y = np.array([210737, 2175, 514, 158, 90, 46, 27, 22, 10, 11, 3, 7, 3, 2, 0, 1, 1, 1, 0, 0, 1, 0, 0,0, 0, 0, 1, 0, 0, 0, 0,1])
popt, pcov = curve_fit(func, x, y)
ax.plot(x, func(x, *popt), 'g--')
ax.plot(x,  y, 'ro', label='data')

ax.set_yscale('log')  # I need to have the y-axis logarithmic
plt.show()

Solution

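  • Your data is very challenging to fit without an initial guess: by default curve_fit starts every parameter at 1, and with b = 1 the term exp(-b * x) is vanishingly small for x values this large, so the optimizer has nothing to work with. Provide a guess via the p0 argument when calling curve_fit (as described in the docs):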

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.optimize import curve_fit
    
    def func(x, a, b, c):
        return a * np.exp(-b * x) + c
    
    x = np.array([88.08064516, 264.24193548, 440.40322581, 616.56451613, 792.72580645, 968.88709677, 1145.0483871, 1321.20967742, 1497.37096774, 1673.53225806, 1849.69354839, 2025.85483871, 2202.01612903, 2378.17741935, 2554.33870968, 2730.5, 2906.66129032,3082.82258065, 3258.98387097, 3435.14516129, 3611.30645161, 3787.46774194, 3963.62903226, 4139.79032258, 4315.9516129, 4492.11290323, 4668.27419355, 4844.43548387, 5020.59677419, 5196.75806452, 5372.91935484, 5549.08064516])
    y = np.array([210737, 2175, 514, 158, 90, 46, 27, 22, 10, 11, 3, 7, 3, 2, 0, 1, 1, 1, 0, 0, 1, 0, 0,0, 0, 0, 1, 0, 0, 0, 0,1])
    
    p0 = [20000,0.003,1]
    popt, pcov = curve_fit(func, x, y, p0=p0)
    
    fig, ax = plt.subplots()
    ax.plot(x, func(x, *popt), 'g--', label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))
    ax.plot(x, y, 'ro', label='data')
    ax.legend()  # show the labels defined above
    plt.show()
    

    Output:

    [Image: data points with the fitted curve]
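To see why the initial guess matters here, the following minimal sketch evaluates the model at curve_fit's default starting point (all parameters equal to 1) and at the guess used above. The x grid is rebuilt with np.linspace, which is an assumption that reproduces the question's evenly spaced values only up to rounding; the names default_p0, guess and exp_term are illustrative.

```python
import numpy as np

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

# Assumption: an evenly spaced grid that matches the question's x values up to rounding
x = np.linspace(88.08064516, 5549.08064516, 32)

# curve_fit starts from all ones when no p0 is supplied
default_p0 = (1.0, 1.0, 1.0)
exp_term = np.exp(-default_p0[1] * x)
print(exp_term.max())          # largest value of exp(-b * x) on this grid
print(func(x, *default_p0))    # the model is flat, essentially equal to c

# With the guess from the answer, the exponential term is no longer negligible
guess = (20000, 0.003, 1)
print(np.exp(-guess[1] * x)[:3])
```

At the default starting point the model barely reacts to changes in a or b, so the optimizer gets almost no information about which direction to move in; a guess with b on the order of 1/x avoids that.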

    Note that this only helps you troubleshoot the problem; you still have to implement the logarithmic axis. For fitting in log space, I would advise pruning the data so that the zero counts do not produce log(0), like so:

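    # Replace zero counts with 1 so that np.log is defined (log(1) = 0)
    y_pruned = np.where(y<1, 1, y)
    # func is now fitted to log(y), so a, b, c describe the log-transformed data
    popt, pcov = curve_fit(func, x, np.log(y_pruned), p0=p0)
    fig, ax = plt.subplots()
    ax.plot(x, func(x, *popt), 'g--', label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))
    ax.plot(x, np.log(y_pruned), 'ro', label='data')
    ax.legend()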
    

    This yields:

    [Image: log-transformed data with the fitted curve]
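As a possible alternative for the log-space fit, here is a rough sketch that is not part of the approach above. It assumes a pure exponential a * exp(-b * x) with the offset c dropped, and it discards the zero counts instead of replacing them with 1. Under those assumptions log(y) is linear in x, so an ordinary straight-line fit with np.polyfit is enough and no initial guess is needed. The x grid is again rebuilt with np.linspace, which matches the question's values only up to rounding, and the names mask and y_fit are illustrative.

```python
import numpy as np

# Assumption: an evenly spaced grid that matches the question's x values up to rounding
x = np.linspace(88.08064516, 5549.08064516, 32)
y = np.array([210737, 2175, 514, 158, 90, 46, 27, 22, 10, 11, 3, 7, 3, 2, 0, 1,
              1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1])

# Keep only strictly positive counts, since log(0) is undefined
mask = y > 0

# Straight-line fit in semi-log space: log(y) = log(a) - b * x
slope, intercept = np.polyfit(x[mask], np.log(y[mask]), 1)
a = np.exp(intercept)
b = -slope
print("a = %.3f, b = %.6f" % (a, b))

# Back-transform to the original scale for plotting on a log y-axis
y_fit = a * np.exp(-b * x)
```

To draw the result on the logarithmic axis from the question, plot y_fit against x with ax.plot and call ax.set_yscale('log') as before. The fit is only as good as the assumption that the data follow a single exponential; if the points curve strongly on the semi-log plot, a different model is probably needed.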