I have a longitudinal dataset for plant growth recorded in different seasons. I fitted the data to growth models with/without seasonal effect using nlme
function from nlme
package.
To see the seasonal effect on the estimated parameter (k), I compared the two models using anova
function, however, I got "Error in getResponseFormula(el) : 'form' must be a two-sided formula" and stuck in this problem.
Can anyone help me solve this problem?
Here is the reproducible example.
library(tidyverse)
# generate dummy data
n = 100
id = as.factor(1:n)
season = c("s1","s2","s3","s4")
L0 = rnorm(n,40,5)
td = rnorm(n, 180, 10)
growth <- function(td, Lt_1){
return(function(k,p){
150*((1 + ((150/Lt_1)^(1/p)-1)*exp(-k*td/365))^(-p))
})
}
L1 = growth(Lt_1 = L0, td = td)(p = 1.2, k = 0.55)
L2 = growth(Lt_1 = L1, td = td)(p = 1.2, k = 1.09)
L3 = growth(Lt_1 = L2, td = td)(p = 1.2, k = 0.62)
L4 = growth(Lt_1 = L3, td = td)(p = 1.2, k = 0.97)
df <- data.frame(id = id, L0 = L0, L1 = L1, L2 = L2, L3 = L3, L4 = L4) %>%
pivot_longer(cols = L0:L4, names_to = "Ltype", values_to = "Lobs") %>%
mutate(Lobs = round(Lobs)) %>%
group_by(id) %>%
mutate(Lt_1 = lag(Lobs)) %>%
filter(!is.na(Lt_1)) %>%
mutate(td = round(rnorm(4, 180, 10))) %>%
mutate(season = factor(season)) %>%
ungroup() %>%
mutate(season2 = case_when(
season == "s3" ~ "s1",
season == "s4" ~ "s2",
TRUE ~ season
)) %>%
mutate(season2 = factor(season2)) %>%
mutate(log_Lobs = log(Lobs))
> head(df)
# A tibble: 6 x 8
id Ltype Lobs Lt_1 td season season2 log_Lobs
<fct> <chr> <dbl> <dbl> <dbl> <fct> <fct> <dbl>
1 1 L1 57 48 178 s1 s1 4.04
2 1 L2 76 57 184 s2 s2 4.33
3 1 L3 87 76 188 s3 s1 4.47
4 1 L4 103 87 194 s4 s2 4.63
5 2 L1 43 35 175 s1 s1 3.76
6 2 L2 63 43 176 s2 s2 4.14
library(nlme)
## model without seasonal effect
formula = Lobs ~ 150*((1 + ((150/Lt_1)^(1/p)-1)*exp(-k*td/365))^(-p))
p = 1.2
k = 0.4
fit_null <- nlme(formula,
fixed = c(p ~ 1, k ~ 1),
random = p ~ 1 | id,
data = df,
start = list(fixed = c(p, k)),
na.action = na.exclude,
control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE))
## model with seasonal effect
p = 1.2
k = c(0.4, 1.09)
fit_withSeason <- nlme(formula,
fixed = c(p ~ 1, k ~ 1 + season2),
random = k ~ 1 | id,
data = df,
start = list(fixed = c(p, k)),
na.action = na.exclude,
control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE))
> summary(fit_null)
Nonlinear mixed-effects model fit by maximum likelihood
Model: formula
Data: df
AIC BIC logLik
2312.975 2328.941 -1152.488
Random effects:
Formula: p ~ 1 | id
p Residual
StdDev: 8.300513e-05 4.315789
Fixed effects: c(p ~ 1, k ~ 1)
Value Std.Error DF t-value p-value
p 0.7335954 0.09075133 299 8.083577 0
k 0.9823510 0.05594228 299 17.560080 0
Correlation:
p
k -0.969
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-1.4898414 -0.8966244 -0.2154542 0.8401292 2.2148509
Number of Observations: 400
Number of Groups: 100
> summary(fit_withSeason)
Nonlinear mixed-effects model fit by maximum likelihood
Model: formula
Data: df
AIC BIC logLik
1466.922 1486.879 -728.4608
Random effects:
Formula: k ~ 1 | id
k.(Intercept) Residual
StdDev: 0.03294767 1.37547
Fixed effects: c(p ~ 1, k ~ 1 + season2)
Value Std.Error DF t-value p-value
p 1.9509211 0.16984927 298 11.48619 0
k.(Intercept) 0.5212180 0.01156849 298 45.05498 0
k.season2s2 0.4151107 0.00828721 298 50.09055 0
Correlation:
p k.(In)
k.(Intercept) -0.868
k.season2s2 -0.572 0.265
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.341255091 -0.727512901 0.003841993 0.622830078 3.190926108
Number of Observations: 400
Number of Groups: 100
# model comparison to test the seasonal effect
> anova(fit_null, fit_withSeason)
Error in getResponseFormula(el) : 'form' must be a two-sided formula
This is arguably a bug in nlme
, but a tricky one. When you pass the formula as a symbol called "formula" (i.e. "formula", rather than "Lobs ~ "), it fails to extract the formula from the model call properly.
If you change the variable name of your formula from formula
to (e.g.) form1
, I think everything should work fine (another reason not to use names of built-in objects as variable names!)
More generally (for other issues involving improper evaluation of objects stored as symbols) you can work around this by using do.call()
, which tricks R into evaluating the formula before storing the call:
fit_null <- do.call(nlme,
list(formula,
fixed = c(p ~ 1, k ~ 1),
random = p ~ 1 | id,
data = df,
start = list(fixed = c(p, k)),
na.action = na.exclude,
control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE)
))
## model with seasonal effect
p = 1.2
k = c(0.4, 1.09)
fit_withSeason <- do.call(nlme,
list(formula,
fixed = c(p ~ 1, k ~ 1 + season2),
random = k ~ 1 | id,
data = df,
start = list(fixed = c(p, k)),
na.action = na.exclude,
control=list(maxIter=1e6, msMaxIter = 1e6, msVerbose = TRUE)))
anova(fit_null, fit_withSeason)
Model df AIC BIC logLik Test L.Ratio p-value
fit_null 1 4 2317.913 2333.879 -1154.9565
fit_withSeason 2 5 1540.088 1560.045 -765.0438 1 vs 2 779.8253 <.0001
I've submitted this as an R bug