pythoncurve-fittingleast-squaresscipy-optimizecovariance-matrix

scipy.optimize.leastsq in Python not returning covariance matrix when fitting data


I am using optimize.leastsq to fit the data I have collected from a Mossbauer Spectroscopy experiment. The data was successfully fit, and the best fit parameters returned are good. But the second output cov_x, as called in the documentation, is given as an integer of 1, when I imagine that I am supposed to get a matrix. Why is this happening and how do I get the correct matrix?

Edit: @slothrop clarified that the output that I thought was cov_x is actually ier, an integer representing an error. 1 stands for ["Both actual and predicted relative reductions in the sum of squares\n are at most %f" % ftol, None]. Could anyone help me understand what this means?

Edit 2: Solved. @slothrop politely informed me that I have to pass the parameter full_output=1 in order for optimize.leastsq to return cov_x.

P.S. I don't want to use optimize.curve_fit because I want to pass the parameters to my model function as a list rather than individual variables.

documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html

Here is my code. Parts of it may seem unnecessary due to my wanting to generalize to arbitrary start, stop points.

import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize, signal

# sample of data

x = np.array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179])
y = np.array([17714, 17582, 17377, 17292, 17210, 17248, 16667, 16479, 16609, 16461, 16584, 16898, 17033, 17162, 17231, 17349, 17420, 17762, 17650, 17511, 17607, 17525, 17818, 17628, 17576, 17576, 17865, 17963, 18005, 17860, 17769, 17935, 17829, 17770, 17847, 17855, 18085, 18123, 17706, 17790, 17766, 17864, 17864, 18094, 17826, 17662, 17693, 17767, 17905, 17805, 17738, 17812, 17690, 17722, 17747, 17679, 17763, 17405, 17570, 17396, 17291, 16892, 16727, 16797, 17003, 17180, 17245, 17436, 17565, 17388, 17522, 17504, 17715, 17585, 17782, 17652, 17736, 17897, 17766, 17789])


def lorentz(x, xpeak, linewidth, intensity):    # parameters: xpeak, linewidth, intensity
    return -intensity*(linewidth/2)**2/((x-xpeak)**2+(linewidth/2)**2)

def curve_fit(start,stop):
    
    xp_est = signal.find_peaks(-y[start:stop], width=5, height=-17250)[0]    # parameter estimates
    xp_est = start + xp_est
    npeaks = np.size(xp_est)
    lw_est = np.full(npeaks,7)
    I_est = (17800-y[xp_est])
    estimates = [xp_est, lw_est, I_est]
        
    def model(x, par):    # sum of lorentz distributions of peaks
        
        xpeak = par[:npeaks]
        linewidth = par[npeaks:2*npeaks]
        intensity = par[2*npeaks:]
        
        lorentz_sum = 17800
        for i in range(npeaks):
            lorentz_sum = lorentz_sum + lorentz(x,xpeak[i],linewidth[i],intensity[i])
        return lorentz_sum
    
    def residual(par, x, y):     
        return y - model(x, par)
    
    par, cov = optimize.leastsq(residual, estimates, (x[start:stop], y[start:stop]))
    plt.plot(x[start:stop], y[start:stop], '.', markersize=2)
    plt.plot(x[start:stop], model(x[start:stop], par))
    
    return par, cov

par, cov = curve_fit(0, 80)
print(par, type(par))
print(cov, type(cov))
plt.show()

Output

[ 162.67354556  108.5074825     5.99266549    7.74311055 1048.1092175
 1361.99804297] <class 'numpy.ndarray'>
1 <class 'int'>
[link]

https://i.sstatic.net/Ixqql.png


Solution

  • By default, it only outputs the result (1d array) and an integer flag (the integer number you got). If you want the full output, use this:

    x0, cov, info, msg, status = optimize.leastsq(residual, estimates, (x[start:stop], y[start:stop]), full_output=True)
    

    cov is what you want

    You have to pass the full_output=True parameter in the leastsq() function