rstatisticsnlssingular

Singular gradient in nls model - how can I make it work?


I have data, on which I want to fit a non-linear regression model. The model is a physical model to compute the Chlorid defusion coefficient. In my case the model looks like

Cx = Ci + (Cs - Ci) * erfc(x / (sqrt(4 * D * t))

with Ci = 0.020664, t = 28/365, x and Cx being in the data and Cs and D are the coefficients to be computed. Erfc is the complementary error function.

I have data in form of

data = data.frame(x=c(2.13, 4.38, 6.38, 8.13, 10.38, 13.88, 17.38), 
                  Cx=c(0.085, 0.017, 0.011, 0.010, 0.009, 0.010, 0.009)) 

So what I coded in R was

erf <- function(x) 2 * pnorm(x * sqrt(2)) - 1 # error function
erfc <- function(x) 1 - erf(x) # complementary error function

m1 <- nls(formula = Cx ~ 0.020664 + (Cs - 0.020664) *
            erfc(x / (sqrt(4 * D * (28/365)))), 
          data = data,
          start = list(Cs = 0.5, D = 50))

Which gives me the error message "singular gradient". Since the data is already given and I can't really change the model either, has somebody an idea how to solve this?
(I saw that often times it is recommended to use a different library than nls when this error occurs, but these (i.e nlsr) couldn't derive the erfc function.)


Solution

  • Since erfc > 0 if Cs - 0.020664 is positive then the second term is positive so the entire RHS will be above 0.020664 whereas all the points except the first are below it. Also if Cs - 0.020664 is negative then all points will be below 0.020664 in which case it will be far from the first point.

    Thus it is not a matter of finding a different algorithm -- your model simply does not fit the data.

    Also as a general comment simply looking for a different algorithm when nls fails is often a poor strategy since the situation we have here is often the case. A better strategy is to understand the model better and improve it.

    If we modify the model slightly as a linear combination of 1 and erfc then we can get a good fit. The linear combination coefficients are .lin.A and .lin.B and do not need starting values when using the plinear algorithm. The algorithm converges in only 3 iterations which together with the plot below shows it fits the data well. Note that this revised model still has parameter D so if that is the main parameter of interest you can still use this model.

    In the plot below the circles are the data points and the line is the fitted values at them.

    fm <- nls(formula = Cx ~ cbind(A = 1, B = erfc(x / sqrt(4 * D * 28/365))), 
          data = data, start = list(D = 25), algorithm = "plinear")
    
    fm
    ## Nonlinear regression model
    ##   model: Cx ~ cbind(A = 1, B = erfc(x/sqrt(4 * D * 28/365)))
    ##    data: data
    ##         D    .lin.A    .lin.B 
    ## 25.932426  0.009704  0.263631 
    ##  residual sum-of-squares: 2.041e-06
    ##
    ## Number of iterations to convergence: 3 
    ## Achieved convergence tolerance: 6.247e-06
    
    plot(Cx ~ x, data)
    lines(fitted(fm) ~ x, data)
    

    screen