Tags: convolution, gaussian, errorbar, lmfit, levenberg-marquardt

Fitting data with Voigt profile in lmfit in python - huge errors


I am trying to fit some RIXS data with Voigt profiles (lmfit in Python), and I have defined the Voigt profile in the following way:


import numpy as np

def gfunction_norm(x, pos, gwid):
    # Gaussian, rescaled to span [0, 1] on the sampled grid
    gauss = (1 / (gwid * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - pos) / gwid) ** 2)
    return (gauss - gauss.min()) / (gauss.max() - gauss.min())

def lfunction_norm(x, pos, lwid):
    # Lorentzian with FWHM lwid (0.15915 ~ 1/(2*pi)), rescaled to span [0, 1]
    lorentz = (0.15915 * lwid) / ((x - pos) ** 2 + 0.25 * lwid ** 2)
    return (lorentz - lorentz.min()) / (lorentz.max() - lorentz.min())

def voigt(x, pos, gwid, lwid, intensity):
    # "intensity" was called "int" originally, which shadows the Python builtin
    step = 0.005
    x2 = np.arange(pos - 7, pos + 7 + step, step)
    voigt3 = np.convolve(gfunction_norm(x2, pos, gwid),
                         lfunction_norm(x2, pos, lwid), mode='same')
    norm = (voigt3 - voigt3.min()) / (voigt3.max() - voigt3.min())
    # interpolate onto x (the original referenced a global "energy" array here)
    y = np.interp(x, x2, norm)
    return y * intensity

I have used this definition instead of the popular Voigt profile definition in Python:

import numpy as np
from scipy.special import wofz

def voigt(x, alpha, cen, gamma):
    # alpha is the Gaussian HWHM; the 2.51 in the popular snippet is sqrt(2*pi)
    sigma = alpha / np.sqrt(2 * np.log(2))
    return np.real(wofz((x - cen + 1j * gamma) / (sigma * np.sqrt(2)))) / (sigma * np.sqrt(2 * np.pi))

because it gives me more clarity on the intensity of the peaks etc.

Now, I have a couple of spectra with 9-10 peaks and I am trying to fit all of them with Voigt profiles (precisely in the way I defined it).

I have a couple of questions:

  1. Do you think my Voigt definition is OK? What (dis)advantages do I have by using the convolution instead of the approximative second definition?

  2. As a result of my fit, sometimes I get crazy large standard deviations. For example, these are best-fit parameters for one of the peaks:

    int8:    0.00986265 +/- 0.00113104 (11.47%) (init = 0.05)
    pos8:   -2.57960013 +/- 0.00790640 (0.31%) (init = -2.6)
    gwid8:   0.06613237 +/- 0.02558441 (38.69%) (init = 0.1)
    lwid8:   1.0909e-04 +/- 1.48706395 (1363160.91%) (init = 0.001)

(intensity, position, gaussian and lorentzian width respectively). Does this output mean that this peak should be purely gaussian?

  3. I have noticed that large errors usually happen when the best-fit parameter is very small. Does this have something to do with the Levenberg-Marquardt algorithm that lmfit uses by default? I should note that I sometimes have the same problem even when I use the other definition of the Voigt profile (and not just for Lorentzian widths). Here is a part of the code (it sits in a for loop inside a bigger script, meaning I use the same initial values for multiple similar spectra):
    model = Model(final)
    result = model.fit(spectra[:,nb_spectra], params, x=energy)
    print(result.fit_report())

"final" is the sum of many voigt profiles that I previously defined.

Thank you!


Solution

  • This seems to be a duplicate of or follow-up to Lmfit fit produces huge uncertainties - please use one SO question per topic.

    Do you think my Voigt definition is OK? What (dis)advantages do I have by using the convolution instead of the approximative second definition?

    What makes you say that the second definition is approximate? In some sense, all floating-point calculations are approximate, but the Faddeeva function from scipy.special.wofz evaluates the analytic solution for the Voigt profile essentially to machine precision. Doing the convolution yourself is likely to be slower and is itself an approximation, with an error set by the step size and the finite sampling window rather than by machine precision.
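    To make that concrete, here is a quick numerical check (widths chosen arbitrarily for illustration) that a brute-force convolution of a unit-area Gaussian and Lorentzian reproduces the wofz-based profile near the peak, up to discretization error:

```python
import numpy as np
from scipy.special import wofz

def voigt_wofz(x, sigma, gamma):
    # analytic Voigt profile via the Faddeeva function
    z = (x + 1j * gamma) / (sigma * np.sqrt(2))
    return np.real(wofz(z)) / (sigma * np.sqrt(2 * np.pi))

sigma, gamma = 0.5, 0.25          # arbitrary example widths
step = 0.005
t = np.arange(-20, 20 + step, step)

gauss = np.exp(-t**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
lorentz = gamma / (np.pi * (t**2 + gamma**2))
conv = np.convolve(gauss, lorentz, mode='same') * step   # discrete convolution

center = np.abs(t) < 3
max_err = np.abs(conv[center] - voigt_wofz(t[center], sigma, gamma)).max()
print(max_err)   # small, but limited by the grid, not machine precision
```

    The two agree closely near the peak, but the convolution's accuracy is limited by the step and window, while wofz is accurate to machine precision and cheaper to evaluate repeatedly inside a fit.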

    Since you are using Lmfit, I would recommend using its VoigtModel which will make your life easier: it uses scipy.special.wofz and parameter names that make it easy to switch to other profiles (say, GaussianModel).

    You did not give a very complete example of code (for reference, a minimal, working version of actual code is more or less expected on SO and highly recommended), but that might look something like

    from lmfit.models import VoigtModel
    
    model = VoigtModel(prefix='p1_') + VoigtModel(prefix='p2_') + ...
    

    As a result of my fit, sometimes I get crazy large standard deviations. For example, these are best-fit parameters for one of the peaks:

       int8:    0.00986265 +/- 0.00113104 (11.47%) (init = 0.05)
       pos8:   -2.57960013 +/- 0.00790640 (0.31%) (init = -2.6)
       gwid8:   0.06613237 +/- 0.02558441 (38.69%) (init = 0.1)
       lwid8:   1.0909e-04 +/- 1.48706395 (1363160.91%) (init = 0.001) 
    

    (intensity, position, gaussian and lorentzian width respectively). Does this output mean that this peak should be purely gaussian?

    First, that may not be a "crazy large" standard deviation - it sort of depends on the data and rest of the fit. Perhaps the value for int8 is really, really small, and heavily overlapped with other peaks -- it might be highly correlated with other variables. But, it may very well mean that the peak is more Gaussian-like.

    Since you are analyzing X-ray scattering data, the use of a Voigt function is probably partly justified by the idea (assertion, assumption, expectation?) that the material response gives a Gaussian profile, while the instrumentation (including the X-ray source) gives a Lorentzian broadening. That suggests that the Lorentzian width might be the same for the various peaks, or perhaps parameterized as a simple function of the incident and scattered wavelengths or q values. That is, you might be able to (and may be better off to) constrain the Lorentzian widths (your lwid, I think, or gamma in lmfit.models.VoigtModel) to all be the same.