pythonnumpymathscipybest-fit-curve

Scipy.optimize not fiting to my data


I cannot get scipy.optimize.curve_fit to properly fit my data which is visually apparent. I know approximately what the parameter values should be and if I evaluate the function with the given parameters the calculated and experimental data appear to agree well:

calculated and experimental data

However, when I use scipy.optimize.curve_fit the output parameters with the smallest error is a much worse fit (by visual inspection). If I use the "known" parameters as my initial guess and bound the parameters to a relatively narrow window as shown in the example of output from fit function:

example of output from fit function

I obtain error values ~10^2 times larger but the visual appearance of the fit seems better. The only way I can get a decent looking fit for the data is to bound all the parameters with ~ 0.3 units of the "known" parameter. I plan on using this code to fit more complex data that I will not know the parameters before hand, so I cannot just use the calculated plot.

The relevant code is included below:

import matplotlib.pyplot as plt
import numpy as np
import scipy
from scipy.optimize import curve_fit
d_1= 2.72 #Anstroms
sig_cp_1= 0.44
sig_int_1= 1.03
d_1, sig_cp_1,sig_int_1=float(d_1),float(sig_cp_1),float(sig_int_1)
Ref=[]
Qz_F=[]
Ref_F=[]
g=open("Exp_Fresnal.csv",'rb')#"Test_Fresnal.csv", 'rb')
reader=csv.reader(g)
for row in reader:
    Qz_F.append(row[0])
    Ref.append(row[1])
    Ref_F.append(row[2])
Ref=map(lambda a:float(a),Ref)
Ref_F=map(lambda a:float(a),Ref_F)
Qz_F=map(lambda a:float(a),Qz_F)
Ref_F_arr=np.array((Ref_F))
Qz_arr=np.array((Qz_F))
x=np.array((Qz_arr,Ref_F))
def func(x,d,sig_int,sig_cp):
     return (x[1])*(abs(x[0]*d*(np.exp((-sig_int**2)*(x[0]**2)/2)/(1-np.exp(complex(0,1)*x[0]*d)*np.exp((-sig_cp**2)*(x[0]**2)/2)))))**2
DC_ref=func(x,d_1,sig_int_1,sig_cp_1)
Y=np.array((Ref))
popt, pcov=curve_fit(func,x,Y,)#p0=[2.72,1.0,0.44])
perr=np.sqrt(np.diag(pcov))
print "par=",popt;print"Var=",perr
Fit=func(x,*popt)
Fit=func(x,*popt)
Ref=np.transpose(np.array([Ref]))
Qz_F=np.transpose(Qz_F)

plt.plot(Qz_F, Ref, 'bs',label='Experimental')
plt.plot(Qz_F, Fit, 'r--',label='Fit w/ DCM model')
plt.axis([0,3,10**(-10),100])
plt.yscale('log')
plt.title('Reflectivity',fontweight='bold',fontsize=15)
plt.ylabel('Reflectivity',fontsize=15)
plt.xlabel('qz /A^-1',fontsize=15)
plt.legend(loc='upper right',numpoints=1) 
plt.show()

The arrays are imported from a file (which I cannot include) and there are no outlier points that would cause the fit to become this distorted. Any help is appreciated.

Edit I included additional code and the input data to go along with the code but you will have to re-save it as a MS-Dos .CSV


Solution

  • @WarrenWeckesser has a really good point, but further note that the y axis is logarithmic. That apparently huge error at the right end is something like 1e-5 in magnitude, while the points on the top left have reflectivity values of around 0.1. The square error coming from the tail is simply insignificant compared to the huge terms on the left.

    I'm sure curve_fit works great. If you want a better visual fit, I suggest trying a fit to log(y) with the log() of your model (either that, or weight your points during fitting); then the result might be more stable visually (and from a physical point of view). Since you're probably trying to give an overall broad-spectrum description of your system, this might be closer to what you expect (but this will inevitably lead to a less precise fit where the reflectivity is high).