I am receiving the following error when running the mle2()
function from the bbmle
package in R:
some parameters are on the boundary: variance-covariance calculations based on Hessian may be unreliable
I am trying to understand if this is due to a problem with my data or an issue with calling the function properly. Unfortunately, I cannot post my real data, so I am using a similar working example of the same sample size.
The custom dAction
function I am using is a softmax function. There have to be upper and lower bounds on the optimization so I am using the L-BFGS-B method.
library(bbmle)
set.seed(3939)
### Reproducible data
dat1 <- rnorm(30, mean = 3, sd = 1)
dat2 <- rnorm(30, mean = 3, sd = 1)
dat1[c(1:3, 5:14, 19)] <- 0
dat2[c(4, 15:18, 20:22, 24:30)] <- 0
### Data variables
x <- sample(1:12, 30, replace = TRUE)
pe <- dat1
ne <- dat2
### Likelihood
dAction <- function(x, a, b, t, pe, ne, log = FALSE) {
u <- exp(((x - (a * ne) - (b * pe)) / t))
prob <- u / (1 + u)
if(log) return(prob) else return(-sum(log(prob)))
}
### Fit
fit <- mle2(dAction,
start = list(a = 0.1, b = 0.1, t = 0.1),
data = list(x = x, pe = pe, ne = ne),
method = "L-BFGS-B",
lower = c(a = 0.1, b = 0.1, t = 0.1),
upper = c(a = 10, b = 1, t = 10))
Warning message:
In mle2(dAction, start = list(a = 0.1, b = 0.1, t = 0.1), data = list(x = x, :
some parameters are on the boundary: variance-covariance calculations based on Hessian may be unreliable
Here are the results for summary()
:
summary(fit)
Maximum likelihood estimation
Call:
mle2(minuslogl = dAction, start = list(a = 0.1, b = 0.1, t = 0.1),
method = "L-BFGS-B", data = list(x = x, pe = pe, ne = ne),
lower = c(a = 0.1, b = 0.1, t = 0.1), upper = c(a = 10, b = 1,
t = 10))
Coefficients:
Estimate Std. Error z value Pr(z)
a 0.1 NA NA NA
b 0.1 NA NA NA
t 0.1 NA NA NA
-2 log L: 0.002048047
Warning message:
In sqrt(diag(object@vcov)) : NaNs produced
And the results for the confidence intervals
confint(fit)
Profiling...
2.5 % 97.5 %
a NA 1.0465358
b NA 0.5258828
t NA 1.1013322
Warning messages:
1: In sqrt(diag(object@vcov)) : NaNs produced
2: In .local(fitted, ...) :
Non-positive-definite Hessian, attempting initial std err estimate from diagonals
I don't entirely understand the context of your problem, but:
The issue (whether it is a real problem or not depends very much on the aforementioned context that I don't understand) has to do with your constraints. If we do the fit without the constraints:
### Fit
fit <- mle2(dAction,
start = list(a = 0.1, b = 0.1, t = 0.1),
data = list(x = x, pe = pe, ne = ne))
## method = "L-BFGS-B",
## lower = c(a = 0.1, b = 0.1, t = 0.1),
## upper = c(a = 10, b = 1, t = 10))
we get coefficients that are below your bounds.
coef(fit)
a b t
0.09629301 0.07724332 0.02405173
If this is correct, at least one of the constraints is going to be active (i.e. when we fit with lower bounds, at least one of our parameters will hit the bounds - in fact, it's all of them). When fits are on the boundary, the simplest machinery for computing confidence intervals (Wald intervals) doesn't work. However, this doesn't affect the profile confidence interval estimates you report above. These are correct - the lower bounds are reported as NA
because the lower confidence limit is at the boundary (you can replace these by 0.1 if you like).
If you didn't expect the optimal fit to be on the boundary, then I don't know what's going on, maybe a data issue.
Your log-likelihood function is not wrong, but it's a little confusing because you have a log
argument that returns the negative log-likelihood when log=FALSE
(default) and the likelihood when log=TRUE
. Before I realized that, I rewrote the function (I also made it a little more numerically stable by doing computations on the log scale wherever possible).
dAction <- function(x, a, b, t, pe, ne) {
logu <- (x - (a * ne) - (b * pe)) / t
lprob <- logu - log1p(exp(logu))
return(-sum(lprob))
}