I would like to ask some questions about lmfit accuracy (and possibly obtain better fit results by obtaining the answer). All experimental spectra are limited by sampling, that is, by the distance between two points in the x-axis direction. I have noticed (so far) two instances when lmfit tries to overcome this limitation, and it is causing me problems:
When FWHM of a peak tends to zero. I assume that if any two neighbor points are separated by around 0.013, then the fit result for the FWHM of 0.00000005 and multimillion percent error don't make much sense. I have solved this problem by putting a proper lower boundary on the FWHM of my peaks. I have also tried fitting some peaks with a Voigt profile, and whenever the Lorentzian width shows this kind of behavior, I convert it into a pure Gaussian. I think it makes no sense to keep it a Voigt in this condition. Is my reasoning correct?
When the position of a peak tends to zero. I believe the reasoning is the same as what I mentioned above, but this time, I don't really know how to limit it "from being too accurate".
Here is the code of the part that is causing actual problems:
import lmfit
from lmfit import Model, Parameters
import matplotlib.pyplot as plt
import numpy as np
x=[-0.3933, -0.38, -0.3667, -0.3533, -0.34, -0.3267, -0.3133, -0.3, -0.2867, -0.2733, -0.26, -0.2467, -0.2333, -0.22, -0.2067, -0.1933, -0.18, -0.1667, -0.1533, -0.14, -0.1267, -0.1133, -0.1, -0.0867, -0.0733, -0.06, -0.0467, -0.0333, -0.02, -0.0067, 0.0067, 0.02, 0.0333, 0.0467, 0.06, 0.0733, 0.0867, 0.1, 0.1133, 0.1267, 0.14, 0.1533, 0.1667, 0.18, 0.1933, 0.2067, 0.22, 0.2333, 0.2467, 0.26, 0.2733, 0.2867]
y=[0.0048, 0.005, 0.0035, 0.0034, 0.0038, 0.004, 0.0034, 0.0036, 0.0038, 0.0046, 0.0038, 0.0039, 0.0054, 0.0065, 0.0073, 0.0086, 0.0079, 0.0102, 0.0105, 0.0141, 0.0192, 0.0259, 0.0275, 0.0279, 0.0257, 0.0247, 0.022, 0.0244, 0.0268, 0.0295, 0.0275, 0.0227, 0.0192, 0.0138, 0.0075, 0.0088, 0.0081, 0.005, 0.0041, 0.0034, 0.0023, 0.0019, 0.0021, 0.0019, 0.0016, 0.0013, 0.0022, 0.002, 0.0019, 0.0014, 0.0022, 0.0012]
def gfunction_norm(x, pos, gfwhm, int):
gwid = gfwhm/(2*np.sqrt(2*np.log(2)));
gauss= (1/(gwid*(np.sqrt(2*np.pi))))*(np.exp((-1.0/2)*((((x-pos)/gwid))**2)))
return int*(gauss-gauss.min())/(gauss.max()-gauss.min())
def final(x, a, b, int2, pos2, gfwhm2, int3, pos3, gfwhm3):
return a*x+b + gfunction_norm(x, pos2, gfwhm2, int2) + gfunction_norm(x, pos3, gfwhm3, int3)
params1=Parameters()
params1.add('a', value=-2.8e-04)
params1.add('b', value=0.003)
params1.add('int2', value=0.04, min=0.01)
params1.add('pos2', value=0, min=-0.05, max=0.05)
params1.add('gfwhm2', value=0.05, min = 0.005, max=0.2)
params1.add('int3', value=0.04, min=0.01)
params1.add('pos3', value=-0.11, min=-0.13, max=-0.06)
params1.add('gfwhm3', value=0.090001, min=0.078, max=0.2)
model1 = Model(final)
result1 = model1.fit(y, params1, x=x)
print(result1.fit_report())
plt.plot(x, y, 'bo', markersize=4)
plt.plot(x, result1.best_fit, 'r-', label='best fit', linewidth=2)
plt.plot(x, gfunction_norm(x, result1.params['pos2'].value, result1.params['gfwhm2'].value, result1.params['int2'].value))
plt.plot(x, gfunction_norm(x, result1.params['pos3'].value, result1.params['gfwhm3'].value, result1.params['int3'].value))
plt.legend()
plt.show()
This is what I obtain as result of the fit:
a: -0.00427895 +/- 0.00102828 (24.03%) (init = -0.00028)
b: 0.00331554 +/- 2.6486e-04 (7.99%) (init = 0.003)
int2: 0.02301220 +/- 9.6324e-04 (4.19%) (init = 0.04)
pos2: 0.00175738 +/- 0.00398305 (226.65%) (init = 0)
gfwhm2: 0.08657191 +/- 0.00708478 (8.18%) (init = 0.05)
int3: 0.02261912 +/- 8.7317e-04 (3.86%) (init = 0.04)
pos3: -0.09568096 +/- 0.00432018 (4.52%) (init = -0.11)
gfwhm3: 0.09304840 +/- 0.00797209 (8.57%) (init = 0.090001)
You can see the huge error next to pos2, and I'm not sure how to fix it.
Thank you!
As values tend to zero, the "percent uncertainty" will increase. That is, if the x-axis were shifted by +1, then your pos2
would have a value of 1.00176 with a standard error of 0.004, and the percent shown would be below 1% -- and the fit would be exactly the same.
You could interpret that as "pos2
is consistent with 0", but it is also true that the estimated standard error is 0.004, whereas the x-spacing of your data is around 0.01. So, yes, the value close to 0 but apparently known to be pretty close to that very small best-fit value.
That is, don't get too concerned about the size of the standard error compared to the best-fit value.