MLE error in R: initial value in 'vmmin' is not finite


Suppose I have 2 data.frame objects:

df1 <- data.frame(x = 1:100)
df1$y <- 20 + 0.3 * df1$x + rnorm(100)
df2 <- data.frame(x = 1:200000)
df2$y <- 20 + 0.3 * df2$x + rnorm(200000)

I want to fit the model by maximum likelihood (MLE). With df1 everything works:

LL1 <- function(a, b, mu, sigma) {
    R <- dnorm(df1$y - a - b * df1$x, mu, sigma)
    -sum(log(R))
}
library(stats4)
mle1 <- mle(LL1, start = list(a = 20, b = 0.3, sigma = 0.5),
        fixed = list(mu = 0))

> mle1
Call:
mle(minuslogl = LL1, start = list(a = 20, b = 0.3, sigma = 0.5), 
fixed = list(mu = 0))

Coefficients:
      a           b          mu       sigma 
23.89704180  0.07408898  0.00000000  3.91681382 

But when I do the same with df2, I get an error:

LL2 <- function(a, b, mu, sigma) {
    R <- dnorm(df2$y - a - b * df2$x, mu, sigma)
    -sum(log(R))
}
mle2 <- mle(LL2, start = list(a = 20, b = 0.3, sigma = 0.5),
              fixed = list(mu = 0))
Error in optim(start, f, method = method, hessian = TRUE, ...) : 
  initial value in 'vmmin' is not finite

How can I overcome it?


Solution

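  • The value of R becomes zero at some point, typically because dnorm underflows to exactly 0 for a large enough standardized residual. Then log(R) is -Inf, the negative log-likelihood to be minimized is not finite, and optim stops with an error.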

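    Passing log = TRUE to dnorm handles this better, because the log-density is then computed directly on the log scale; see the function LL3 below. The code still gives some warnings, most likely from the optimizer trying negative values of sigma, for which dnorm returns NaN. A result is returned nonetheless, with parameter estimates close to the true parameters.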

    library(stats4)
    set.seed(123)
    e <- rnorm(200000)
    x <- 1:200000
    df3 <- data.frame(x)
    df3$y <- 20 + 0.3 * df3$x + e
    LL3 <- function(a, b, mu, sigma) {
      -sum(dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE))
    }
    mle3 <- mle(LL3, start = list(a = 20, b = 0.3, sigma = 0.5),
      fixed = list(mu = 0))
    Warning messages:
    1: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    2: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    3: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    4: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    5: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    6: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    7: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    8: In dnorm(df3$y - a - b * df3$x, mu, sigma, log = TRUE) : NaNs produced
    
    > mle3
    Call:
    mle(minuslogl = LL3, start = list(a = 20, b = 0.3, sigma = 0.5), 
        fixed = list(mu = 0))
    
    Coefficients:
            a         b        mu     sigma 
    19.999166  0.300000  0.000000  1.001803
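
To see why the two formulations differ, here is a minimal sketch of the underflow. It is only an illustration: the data frame toy and the functions nll_naive and nll_log are hypothetical names, not part of the question, and the poor intercept a = -30 is chosen only to force large residuals.

```r
# Single-point illustration: a value 40 standard deviations from the mean.
p <- dnorm(40)                        # density underflows to exactly 0
naive_log  <- log(p)                  # log(0) is -Inf
stable_log <- dnorm(40, log = TRUE)   # computed on the log scale, stays finite

# Toy data (hypothetical, not the data from the question).
toy <- data.frame(x = 1:10)
toy$y <- 20 + 0.3 * toy$x

# Negative log-likelihood written both ways; the mean is fixed at 0.
nll_naive <- function(a, b, sigma) {
  -sum(log(dnorm(toy$y - a - b * toy$x, 0, sigma)))
}
nll_log <- function(a, b, sigma) {
  -sum(dnorm(toy$y - a - b * toy$x, 0, sigma, log = TRUE))
}

# A deliberately poor intercept makes every residual about 50.
bad_naive <- nll_naive(a = -30, b = 0.3, sigma = 1)   # Inf
bad_log   <- nll_log(a = -30, b = 0.3, sigma = 1)     # finite

print(c(naive_log = naive_log, stable_log = stable_log))
print(c(bad_naive = bad_naive, bad_log = bad_log))
```

At sensible parameter values both versions return the same number; they only diverge once a density underflows, and an optimizer such as optim cannot proceed from an infinite objective value.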