I am trying to implement masked-array fitting using lmfit. Previously I was using curve_fit with the following code snippet. I have omitted the body of def spec because of its length; essentially, spec is a function that performs 50 interpolations of templates. Both the observation data and the templates have the file structure x = wavelength, y = flux, from x = 500 to x = 550 nm.

The observation data is a mixture of two spectra, each spectrum having different parameters. The file lambda_input is a list of wavelengths (example: list of wavelengths [1]) to be used in the fitting process. The fit is only to be carried out in these wavelength ranges (+/- the line width lw).

Using sigma=weight works with curve_fit, but now I need to use lmfit and I am struggling to find a method that works. How can I mask both the observation and template wavelengths to lambda_input in order to use lmfit?

Thank you
import pandas as pd
import numpy as np
from scipy import interpolate
from scipy.optimize import curve_fit
import lmfit
i_run = pd.read_csv("run.txt",delimiter='\t')
i_run.columns = ["wave","flux"]
run_w = np.array(i_run['wave'])
run_f = np.array(i_run['flux']) # observation data
lambda_input = pd.read_excel("../lambda/weights.xlsx") #fitting wavelengths
v1 = 5 # conversion factors
v2 = 6
lw = 0.01 # linewidths (+/- fitting wavelength)
weight = np.zeros(len(run_w))
for w in range(0, 187):
    n = lambda_input.iat[w, 1]
    weight += np.select([np.logical_and(run_w >= ((n * (1 + v1)) - lw), run_w <= ((n * (1 + v1)) + lw)), run_w > 0], [1, 0])
for w in range(0, 187):
    n = lambda_input.iat[w, 1]
    weight += np.select([np.logical_and(run_w >= ((n * (1 + v2)) - lw), run_w <= ((n * (1 + v2)) + lw)), run_w > 0], [1, 0])
weight = np.select([weight>0,weight==0],[1,1e6])
p1 = 750
p2 = 800
p3 = 1.0
p4 = 20.0
p5 = 30.0
p6 = 0.5
p7 = 0.5
p8 = 100
p9 = -100
p10 = 4.0
p11 = 4.0
p12 = 3.0
p13 = 3.0
def spec(run_w, t1, t2, r, v1, v2, m1, m2, r1, r2, l1, l2, mi1, mi2):
    # body omitted for brevity: performs 50 interpolations of the templates
    # and returns the combined model spectrum
    return spectrum
popt, pcov = curve_fit(spec, xdata=run_w,sigma=weight,ydata=run_f, p0=[p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13])
model = lmfit.Model(spec) #fitting: lmfit
p = model.make_params(t1=p1,t2=p2,r=p3,v1=p4,v2=p5,m1=p6,m2=p7,r1=p8,r2=p9,l1=p10,l2=p11,mi1=p12,mi2=p13)
fit = model.fit(data=run_f, params=p, run_w=run_w, method='nelder', nan_policy='omit')
lmfit.report_fit(fit)
fit.plot()
[1]: https://i.sstatic.net/9wAb2.png
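For what it's worth, the two weight-building loops above can also be written in a vectorized form. The sketch below assumes (as the loops do) that column 1 of lambda_input holds the 187 fitting wavelengths; it produces the same sigma-style array, 1 inside the fit windows and 1e6 elsewhere:

centres = lambda_input.iloc[:187, 1].to_numpy()                     # fitting wavelengths
shifted = np.concatenate([centres * (1 + v1), centres * (1 + v2)])  # both shifted line lists
in_window = np.any(np.abs(run_w[:, None] - shifted[None, :]) <= lw, axis=1)
weight = np.where(in_window, 1.0, 1e6)                              # 1 in-window, 1e6 outside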
With lmfit.Model, weighting the elements of the data array is done with the weights argument to the lmfit.Model.fit() method, which modifies the quantity to be minimized in the least-squares sense from data-model to weights*(data-model). See https://lmfit.github.io/lmfit-py/model.html#lmfit.model.ModelResult.fit
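As a minimal, self-contained sketch of that behaviour (using a toy single-Gaussian model rather than the spec function from the question), points given a weight of zero contribute nothing to weights*(data-model) and are effectively masked out of the fit:

import numpy as np
import lmfit

def gaussian(x, amp, cen, wid):
    # simple Gaussian line profile, used only for this illustration
    return amp * np.exp(-(x - cen)**2 / (2 * wid**2))

x = np.linspace(500, 550, 501)
y = gaussian(x, 3.0, 525.0, 1.5) + np.random.normal(scale=0.05, size=x.size)

# weight = 1 inside a window around the line, 0 elsewhere: only the
# windowed points influence the minimized quantity weights*(data-model)
weights = np.where(np.abs(x - 525.0) <= 5.0, 1.0, 0.0)

model = lmfit.Model(gaussian)
params = model.make_params(amp=2.0, cen=524.0, wid=1.0)
result = model.fit(y, params, x=x, weights=weights)
print(result.fit_report())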
Ironically, you have a weight array, but use it as sigma in curve_fit. To use it with lmfit.Model.fit(), you would want to use 1.0/weight, or modify your weight to something like:
weight = np.select([weight>0,weight==0],[1,1.e-6])
and then use it with:
fit = model.fit(data=run_f, params=p, run_w=run_w,
weights=weight, method='nelder', nan_policy='omit')
lmfit.report_fit(fit)
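Equivalently, if you keep the original sigma-style array (1 inside the windows, 1e6 outside), you can simply invert it when passing it to fit(); this sketch assumes weight still holds those sigma-style values:

lm_weights = 1.0 / weight          # a sigma of 1e6 becomes a weight of 1e-6
fit = model.fit(data=run_f, params=p, run_w=run_w,
                weights=lm_weights, method='nelder', nan_policy='omit')
lmfit.report_fit(fit)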
I would make a couple more suggestions:

- Don't use method='nelder' as the fitting method unless you know you need it (and if so, ask yourself why you didn't use it with curve_fit). The default fitting method is highly recommended, especially to start.
- The same goes for nan_policy='omit'. If your model generates NaNs or Infs, nan_policy won't help at all - those would have to be checked for and resolved.
- A feature of lmfit.Model is that parameters are named so as to be meaningful to the person running and interpreting the results. Having 13 parameters with names like t1, t2, v1, v2, m1, m2, etc is going to be hard to interpret in the future. Names like "time1", "theta1", "velo1", "mass1", "mean1", etc will be much clearer (see the sketch after this list).
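For example (a hypothetical toy model, not the spec function above), lmfit takes parameter names directly from the model function's signature, so descriptive argument names carry through to make_params(), the fit report, and the plots:

import numpy as np
import lmfit

def toy_model(x, velocity1, velocity2, mass_ratio):
    # illustrative only: the names are chosen to be self-explanatory
    return velocity1 * x + velocity2 * np.sin(x) + mass_ratio

model = lmfit.Model(toy_model)
params = model.make_params(velocity1=20.0, velocity2=30.0, mass_ratio=0.5)
print(model.param_names)   # ['velocity1', 'velocity2', 'mass_ratio']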