rminimizationmle

R: Maximum Likelihood Estimation of a exponential mixture using optim


I'm trying to get the parameters w, lambda_1, lambda_2 and p from a mixture bi-exponential model, using a loglikelihood function and the optim function in R. The model is the following

bi-exponential mixture

Here is the code

biexpLL <- function(theta, y) {
  # define parameters
  w <- theta[1]
  lambda_1 <- theta[2]
  a <- theta[3]
  lambda_2 <- theta[4]
  # likelihood function with dexp
  l <- w * dexp((y - a), rate = 1/lambda_1) + (1 - w) * dexp((y - a), rate = 1/lambda_2)
  
  - sum(log(l))
}
# Generate some fake data
w <- 0.7
n <- 500
lambda_1 <- 2
lambda_2 <- 0.2
set.seed(45)
biexp_data <- (w * rexp(n, 1/lambda_1) + (1 - w) * rexp(n, 1/lambda_2)) 
# Optimization
optim(par = c(0.5,0.1,0.001,0.2),
      fn=biexpLL,
      y=biexp_data)
#$par
#[1] -94789220.4     16582.9   -333331.7 134744336.2

The parameters are very different from the used in the fake data! What I'm doing wrong?


Solution

  • The original code is prone to warnings and errors since the parameters may go to invalid values easily. For example, we need w in [0, 1] and lambda > 0. Also, if a is larger than a data point, then the density becomes zero, hence infinite log likelihood.

    The code below uses some tricks to handle these cases.

    Also, the data generation process has been changed so that samples are generated from one of the exponential distributions with the given probability w.

    Finally, increased the sample size since the result was not stable with n=500.

    biexpLL <- function(theta, y) {
      # define parameters
      w <- 1/(1+exp(-theta[1]))
      lambda_1 <- exp(theta[2])
      a <- theta[3]
      lambda_2 <- exp(theta[4])
      # likelihood function with dexp
      l <- w * dexp((y - a), rate = 1/lambda_1) + (1 - w) * dexp((y - a), rate = 1/lambda_2)
      - sum(log(l + 1e-9))
    }
    # Generate some fake data
    w <- 0.7
    n <- 5000
    lambda_1 <- 2
    lambda_2 <- 0.2
    set.seed(45)
    n1 <- round(n*w)
    n2 <- n - n1
    biexp_data <- c(rexp(n1, rate=1/lambda_1),
                    rexp(n2, rate=1/lambda_2)) 
    # Optimization
    o <- optim(par=c(0.5,0.1,0.001,0.2),
               fn=biexpLL,
               y=biexp_data)
    
    1/(1+exp(-o$par[1]))
    exp(o$par[2])
    o$par[3]
    exp(o$par[4])
    

    On my environment I obtained the below.
    The result seems reasonably close to the simulation parameters (note that two lambda values are swapped).

    > 1/(1+exp(-o$par[1]))
    [1] 0.3458264
    > exp(o$par[2])
    [1] 0.1877655
    > o$par[3]
    [1] 3.738172e-05
    > exp(o$par[4])
    [1] 2.231844
    

    Notice that for mixture models of this kind, people often use the EM algorithm for optimizing the likelihood instead of the direct optimization as this. You may want to have a look at it as well.