I have the following data:
str(growth_data)
tibble [92 × 4] (S3: tbl_df/tbl/data.frame)
$ person: num [1:92] 1 1 1 1 2 2 2 2 3 3 ...
$ gender: chr [1:92] "F" "F" "F" "F" ...
$ growth: num [1:92] 21 20 21.5 23 21 21.5 24 25.5 20.5 24 ...
$ age : Factor w/ 4 levels "8","10","12",..: 1 2 3 4 1 2 3 4 1 2 ...
And from this, using the lme() function in the nlme package, I have created the following model:
# Fitting a mixed model with a random coefficient and unstructured covariance structure.
unstructured_rand <- nlme::lme(growth ~ gender * age,
                               random = ~ age | person,
                               data = growth_data,
                               correlation = corSymm())
I am trying to produce a set of predictions for new age values, not in my data, for persons in my data. Specifically, I want to produce a prediction for person 1 at age 13.
I have tried, in vain, to use the predict() function whilst specifying the newdata argument, like so:
newGrowth <- expand.grid(
  person = unique(growth_data$person),
  gender = c("F", "M"),
  age = c(13, 15, 17, 20)
)
newGrowth$Predicted_Response <- predict(unstructured_rand, newdata = newGrowth)
However, I keep running into the following error:
Error in `Names<-.pdMat`(`*tmp*`, value = value[[i]]) :
Length of names should be 4
This seems to suggest that my newdata does not have the correct number of columns, but in the other posts I have read on this subject, nobody supplies a newdata data frame with the same number of columns as the original data. Further, the only column in my data that is not in the newdata data frame is growth, which is the variable I am trying to predict.
What am I missing? There seems to be some obvious element from the documentation on predict.lme() that I am failing to apply to my data, but I cannot figure out what it is.
Any help would be much appreciated!
One issue (or maybe the issue at hand) is that you fit a model on data where age was a factor and then tried to predict on data where age was continuous.
Because you did not supply your data, I can't be certain this is the same issue. But the Orthodont data that ships with nlme is similar to yours, and the code below produces an error with the same wording.
library(nlme)
# make some data like yours
orthodont <- Orthodont
orthodont$age <- factor(orthodont$age)
# fit a model similar to yours
fm1 <- lme(distance ~ age, orthodont, random = ~ age | Subject)
# make some new data like your new data
newOrth <- data.frame(Sex = c("Male", "Male", "Female", "Female", "Male", "Male"),
                      age = c(15, 20, 10, 12, 2, 4),
                      Subject = c("M01", "M01", "F30", "F30", "M04", "M04"))
# attempt prediction and notice same error
predict(fm1, newOrth, level = 0:1)
#> Warning in model.frame.default(formula = asOneFormula(formula(reSt), fixed), :
#> variable 'age' is not a factor
#> Error in `Names<-.pdMat`(`*tmp*`, value = value[[i]]): Length of names should be 4
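The warning already points at the cause. A minimal check, using the same Orthodont stand-in (so the object names here are assumptions, not your actual data), is to compare the class of `age` in the fitting data with its class in the new data:

```r
library(nlme)

# Fitting data: age converted to a factor, as in the example above
orthodont <- Orthodont
orthodont$age <- factor(orthodont$age)

# New data: age supplied as plain numbers
newOrth <- data.frame(age = c(15, 20, 10, 12, 2, 4))

class(orthodont$age)  # "factor"
class(newOrth$age)    # "numeric"

# FALSE here: one column is a factor and the other is not
is.factor(orthodont$age) == is.factor(newOrth$age)
```

If the two classes differ, predict() has no way to map the new numeric ages onto the factor levels the model was fit with.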
Fit a model on data with a continuous age variable and use that for prediction, especially because you are trying to extrapolate beyond the ages on which the model was fit.
# change factor to numeric to match new data
orthodont$age <- as.numeric(as.character(orthodont$age))
# refit
fm2 <- lme(distance ~ age, orthodont, random = ~ age | Subject)
# attempt prediction again
predict(fm2, newOrth, level = 0:1)
#> Subject predict.fixed predict.Subject
#> 1 M01 26.66389 30.95074
#> 2 M01 29.96481 35.33009
#> 3 F30 23.36296 NA
#> 4 F30 24.68333 NA
#> 5 M04 18.08148 20.95016
#> 6 M04 19.40185 22.13877
Created on 2024-05-03 with reprex v2.1.0
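For the specific goal in the question (one known person at an age that was not observed), a minimal sketch along the same lines might look like this. It again uses Orthodont as a stand-in for your data, and the names `fm_num` and `new_one` are chosen here purely for illustration:

```r
library(nlme)

# Orthodont stores age as a number, so the model can be evaluated at any age
fm_num <- lme(distance ~ age, data = Orthodont, random = ~ age | Subject)

# One subject that is present in the fitting data, at an age that is not
new_one <- data.frame(Subject = "M01", age = 13)

# level = 0 gives the population-level prediction,
# level = 1 adds the subject's own random intercept and slope
pred <- predict(fm_num, newdata = new_one, level = 0:1)
pred
```

The subject-level column is only filled in for subjects that were part of the fit. Orthodont's female subjects run from F01 to F11, which is why F30 gets NA in the output above. Also keep in mind that age 13 lies inside the observed range, whereas ages such as 15 or 20 extend a straight line past the data and deserve more caution.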