pythoncurve-fittingcovariancepower-lawloglog

Python - wrrong fit


I am trying to reproduce a known fitting result (reported in a journal paper): applying a power-law model to data. As can be seen in the plot-A below, I was able to reproduce the result by using known best-fit parameters.

< Plot-A: the known result from the literature > Both axis: log-log scale

However, I could not re-derive the best-fit parameters by myself.

< Plot-B: an incorrect fit from curve_fit and lmfit > wrong

The case-A returns,

OptimizeWarning: Covariance of the parameters could not be estimated (if I omit several initial data points, the fit returns some results which are not bad, but still different from the known best-fit result). EDIT: now, i just found new additional error messages this time.. : (1) RuntimeWarning: overflow encountered in power (2) RuntimeWarning: invalid value encountered in power

The case-B (with initial guess more closer to the best-fit parameters) returns,

RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 5000. If I set the maxfev to be much higher to take account of this error message, the fit works but returns an incorrect result (very wrong fit compared to the best-fit result).


Solution

  • As I commented:

    Since you plot the data on a log-log plot, are you also fitting with the log(y) and log(x)? Since your y data varies by 5 or 6 orders of magnitude, if you are not fitting in log-space, only those 3 or 4 data points with the highest y value will matter.

    Apparently that was a bit too subtle of a hint. So I will be more direct: if you are plotting in log-space, FIT IN LOG SPACE.

    But also, your model is very prone to generating complex numbers from negative**fraction and NaNs, which was no doubt causing problems with all your fits. ALWAYS print out the best-fit parameters and covariance matrix.

    So, you may need to impose bounds on your parameters (of course, I have no idea if your model is right or is actually what you think the "right answer" used). Maybe start with something like this:

    import matplotlib.pyplot as plt
    from lmfit import Model
    import numpy as np
    
    # the data can be found at the end of this post.
    xx, yy = np.genfromtxt('the_data', unpack=True)
    
    # the BPL function
    def bendp(x, A, x_bend, allo, alhi, c):
        numer = A * x**-allo
        denom = 1 + (x/x_bend)**(alhi-allo)
        out = (numer/denom) + c
        return  np.log(out)       ## <- TAKE THE LOG
    
    
    mod = Model(bendp)
    para = mod.make_params(A = 0.01, x_bend=1e-2, allo=1., alhi=2., c=1e-2)
    
    # Note: your model is very sensitive # make sure A, x_bend, alhi, and c cannot be negative
    para['A'].min = 0
    para['x_bend'].min = 0
    para['alhi'].min = 0
    para['alhi'].max = 3
    para['c'].min = 0
    
    
    result = mod.fit(np.log(yy), para, x=xx) ## <- FIT THE LOG
    
    print(result.fit_report())
    
    plt.loglog(xx, yy, linewidth=0, marker='o', markersize=6)
    plt.loglog(xx, np.exp(result.best_fit), 'r', linewidth=5)
    plt.show()
    

    Hope that helps...