pythonscipycurve-fittingdata-fittinglmfit

Using lmfit Minimizer.minimize() leads to error ValueError: "The truth value of an array with more than one element is ambiguous"


I want to use lmfit to fit the function:

func(x,region,E0,C0,R1,R3,R4,R5,R6,R7,R8,alpha,beta,rho,theta,delta,d)

and get the information on confidence intervals of the parameters. func takes as parameter x an float or an array of floats with values in between 0 and 78. The function calls an exe generated from a cpp-file. And from the outstream of this exe then returns an array of function values corresponding to the x-values.

I also have an array of data called total_data_array that I want to fit to. I am certain that total_data_array contains the correct values and has the correct size. The following is a shortened version of my code:

#Define fitting function
def func(x,region,E0,C0,R1,R3,R4,R5,R6,R7,R8,alpha,beta,rho,theta,delta,d):#x should be between 0 and 75
    print("Function called")
    print(E0,C0,R1,R3,R4,R5,R6,R7,R8,alpha,beta,rho,theta,delta,d)
    #Make x iterable if it is only a float/int
    if not hasattr(x,'__iter__'):
        x = np.array([x])
    result = []
    output = subprocess.check_output([r'...\model.exe', str(region),str(E0),str(C0),str(R1),str(R3),str(R4),str(R5),str(R6),str(R7),str(R8),str(alpha),str(beta),str(rho),str(theta),str(delta),str(d)])
    output_str = codecs.decode(output)
    output_str_list = output_str.split('\r\n')
    output_str_list.pop()
    dataarray = []
    index = 0
    for word in output_str_list:
        if index in range(416,624) or index in range(728,832):
            if word == '-nan(ind)':
                dataarray.append(0.0)
            else:
                dataarray.append(float(word))
        index+=1
    for x0 in x:
        y = dataarray(int(x0*4))
        result.append(y)
    return result

parameters = lmfit.Parameters()
parameters.add('region',value=1)
parameters.add('E0',value=500,min=0.0,max = 10000.0)
parameters.add('C0',value=200,min=0.0,max = 10000.0)
parameters.add('R1',value=0.587,min=0.0,max=1.0)
parameters.add('R3',value=0.3125,min=0.0,max=1.0)
parameters.add('R4',value=0.1666,min=0.0,max=1.0)
parameters.add('R5',value=0.1,min=0.0,max=1.0)
parameters.add('R6',value=0.2,min=0.0,max=1.0)
parameters.add('R7',value=0.4,min=0.0,max=1.0)
parameters.add('R8',value=0.125,min=0.0,max=1.0)
parameters.add('alpha',value=0.09,min=0.0,max=1.0)
parameters.add('beta',value=0.25,min=0.0,max=1.0)
parameters.add('rho',value=0.2,min=0.0,max=1.0)
parameters.add('theta',value=0.26,min=0.0,max=1.0)
parameters.add('delta',value=0.77,min=0.0,max=1.0)
parameters.add('d',value=0.99,min=0.0,max=1.0)
parameters['region'].vary = False

xData = np.arange(0, 78, 1)
#Running this part of the code works
my_model = lmfit.Model(func)
results = my_model.fit(total_data_array,params=parameters,x=xData)

Calling

results = my_model.fit(total_data_array, params=parameters,x=xData)

works perfectly fine. Also the fit does look quite good. But I want to be able to view the confidence intervals of the parameters. As far as I know, there are two ways to achieve this.

  1. Using the model that I have or

  2. Using Minimizers.

If I try the model approach, adding the lines to my code:

    ci = lmfit.conf_interval(my_model,results)
    lmfit.report_ci(ci)

I get the error:

Traceback (most recent call last):
  File "c:\Users\paul1\OneDrive\Desktop\epidemiology\coding\Secihurd_Model\src\python\regional fitting\fitting2.py", line 128, in <module>
    ci = lmfit.conf_interval(my_model,results)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\lmfit\confidence.py", line 141, in conf_interval
    ci = ConfidenceInterval(minimizer, result, p_names, prob_func, sigmas,
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\confidence.py", line 185, in __init__
    raise MinimizerException(CONF_ERR_STDERR)
lmfit.minimizer.MinimizerException: Cannot determine Confidence Intervals without sensible uncertainty estimates

Plotting the fit_reports yields:

[[Model]]
    Model(func)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 511
    # data points      = 78
    # variables        = 15
    chi-square         = 34979.5187
    reduced chi-square = 555.230456
    Akaike info crit   = 506.253115
    Bayesian info crit = 541.603747
    R-squared          = 0.62955613
##  Warning: uncertainties could not be estimated:
[[Variables]]
    region:  1 (fixed)
    E0:      4.04296161 (init = 10)
    C0:      1.26289805 (init = 5)
    R1:      0.63653349 (init = 0.587)
    R3:      0.60694072 (init = 0.3125)
    R4:      0.11924779 (init = 0.1666)
    R5:      2.7998e-07 (init = 0.1)
    R6:      0.58190868 (init = 0.2)
    R7:      0.49973503 (init = 0.4)
    R8:      9.4685e-07 (init = 0.125)
    alpha:   9.7636e-05 (init = 0.09)
    beta:    0.08512620 (init = 0.25)
    rho:     0.55511880 (init = 0.2)
    theta:   0.08153075 (init = 0.26)
    delta:   0.38812246 (init = 0.77)
    d:       1.4235e-07 (init = 0.99)

So all values have changed significantly from the initial values. I read in other posts that the problem might be that some parameters correlate very strongly. To eliminate this possibility, I set for all but two parameters parameters['...'].vary = False, which did not change the error. I did this for different sets of 'two' parameters.

Now, if I instead follow the second approach: I add the lines:

min = lmfit.Minimizer(func,params=parameters,fcn_args=np.arange(0,78, 1))
min.minimize(method='leastsq')
ci = lmfit.conf_interval(min)
lmfit.report_ci(ci)

So now I only minimize my function. I will later subtract the total_data_array to minimize the difference between my function and the data and obtain a curve fit. For now, I just want to get this to work. I get the error:

Traceback (most recent call last):
  File "...\fitting2.py", line 134, in <module>
    min.minimize(method='leastsq')
  File "...\lmfit\minimizer.py", line 2345, in minimize
    return function(**kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "...\lmfit\minimizer.py", line 1651, in leastsq
    lsout = scipy_leastsq(self.__residual, variables, **lskws)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\scipy\optimize\_minpack_py.py", line 415, in leastsq
    shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\scipy\optimize\_minpack_py.py", line 25, in _check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:...\lmfit\minimizer.py", line 548, in __residual
    out = self.userfcn(params, *self.userargs, **self.userkws)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: func() takes 17 positional arguments but 79 were given

so already the minimisation seems to fail. Changing fcn_args= to args= as parameter in the minimize() function yields instead the error:

File "...\fitting2.py", line 147, in <module>
    min.minimize(method='leastsq')
  File "...\lmfit\minimizer.py", line 2345, in minimize
    return function(**kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "...\lmfit\minimizer.py", line 1651, in leastsq
    lsout = scipy_leastsq(self.__residual, variables, **lskws)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\scipy\optimize\_minpack_py.py", line 415, in leastsq
    shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\scipy\optimize\_minpack_py.py", line 25, in _check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\lmfit\minimizer.py", line 530, in __residual
    if apply_bounds_transformation:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

So it might be that the problem lies in some keyword-arguments that need to be passed to minimize(). But I don't know what to pass as keywords or why. I appreciate any help :)


Solution

  • You give two attempts at using lmfit. The first one is correct, and the fit runs. The second, using lmfit.Minimize does not work because your func is a model function not an objective function: it does not have the same call signature, which is what the exceptions you are seeing tell you. But there is no point in trying lmfit.Minimize anyway. The lmfit.Model runs, it just tells you that it cannot estimate uncertainties. Also FWIW, there is no point in running lmfit.conf_interval if the initial fit cannot estimate uncertainties. That will not "fix the problem".

    As you will have read in the lmfit FAQ, there are a few reasons why uncertainties sometimes cannot be estimated, but they all derive from the same root cause: small changes to the value of one or more of the parameters has no noticeable effect on the fit. When that happens, the covariance matrix cannot be determined -- the fit algorithm has decided that at least one of the parameters is not actually affecting the fit. One reason that can happen is that a parameter gets stuck at a boundary or initial value. That does not seem to be the case for your fit. Another possibility is that one or more parameters have reached such extreme values (say, nearly 0) that other parameters no longer have an effect on the fit. Imagine that you are modeling a peak with a center, width, and height. If the height value moves to zero, it doesn't really matter what the width or center value is, that model will just be zero.

    I guess that is what is happening with your fit. The values for several of your parameters go to ~1e-7 level -- 5 or more orders of magnitude smaller than your initial estimate of them.

    I must say that you do not describe your model at all, and are running some external program and then doing some fairly clunky parsing of a text of results. We don't know what any of that is doing. So, there is no way for anyone else to know that if, say, d goes from 0.99 to 1.4-e7, the rest of the model even makes sense. Maybe you know that.
    Anyway, that is where I would start looking. It seems likely that you have not run your model with initial values (or final values) and plotted the results: that is always highly recommended to check that your initial values are reasonable. With such a large value of Chi-square and a value of R so far from 1, it would be easy to just assert that you have a terrible fit. If you don't have a good fit, uncertainties would be meaningless anyway.