rstatisticsanovamanova

Growth curve analysis in R - comparison of two growth curves


I have a small task to do, unfortunately I'm not familiar with this field of statistics... Actually I did needed calculations (I'm not looking for ready solution), however I don't know if they're correct and also and my way of thinking, hence I'll be very grateful if you take a look and point my mistakes.

Here is fake data, presenting growth rate of dogs and cats (totally made up):

time <- c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)
a <- rep('dog', 10)
b <- rep('cat', 10)
animal <- c(a,b)
val <- c(2.00,8.00,17.00,21.00,29.00,37.00,41.00,56.00,67.00,82.00,1.00,3.00,6.00,8.00,11.00,15.00,21.00,26.00,31.00,37.00)
data <- data.frame(time,animal,val)

Take a closer look:

require(ggplot2)
ggplot(data, aes(time, val, color=animal)) +
  stat_summary(fun.data=mean_se, geom="pointrange") +
  geom_point()

enter image description here

As you see dogs grow faster than cats - it might be my hypothesis. However I need to do some statistics to conform it.

So I decided to perform Growth Curve Analysis (GCA). I was based on this tutorial. Below my results with brief explanation.

So first of all I made a base model, random intercept for each animal:

m.base <- lmer(val ~ time + (1 | animal), data=data, REML = F)

And here I have the problem, actually I don't have any fixed effects here, my dataset is simple, all I want to know is that the growth rate in time in my both groups (dogs and cats) differs statistically significant. In other words. Did the animals differ in their growth rate during this period of time?

Therefore I put my animals as a additional fixed effect:

m.1 <- lmer(val ~ time * animal + (1 | animal), data=data, REML = F)

Now, to check is there statistically significant difference I compared both models using ANOVA.

    > anova(m.base,m.1)
Data: data
Models:
m.base: val ~ time + (1 | animal)
m.1: val ~ time * animal + (1 | animal)
       Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
m.base  4 151.43 155.41 -71.714   143.43                             
m.1     6 116.29 122.26 -52.145   104.29 39.138      2  3.171e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Now I'm confused and I don't understand completely all of these analysis, several questions...

This value 3.171e-09 indicates that growth rate of my groups differs statistically significant?

Shall I made another model:

m.0 <- lmer(val ~ time + animal + (1 | animal), data=data, REML = F)

and then perform model testing?

> anova(m.base,m.0,m.1)
Data: data
Models:
m.base: val ~ time + (1 | animal)
m.0: val ~ time + animal + (1 | animal)
m.1: val ~ time * animal + (1 | animal)
       Df    AIC    BIC  logLik deviance   Chisq Chi Df Pr(>Chisq)    
m.base  4 151.43 155.41 -71.714   143.43                              
m.0     5 145.58 150.56 -67.789   135.58  7.8499      1   0.005082 ** 
m.1     6 116.29 122.26 -52.145   104.29 31.2884      1  2.224e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Which value allows me to confirm my hypothesis?

I will be very grateful for ponint my mistakes, any clues and explanation!


Solution

  • I assume that each data point is from a different animal. If you only have data from two animals, you can only compare these two animals and can't infer anything about the two populations. If you have data from several animals but each animal was measured repeatedly, you would indeed need a mixed effects model. But with my assumption above, you don't need it.

    You could now use a parametric model from domain-specific theory and use nlme::gnls. That function basically fits a non-linear model where the parameters are linear models of some other variables (in your case the type of animal). The parameters of these linear models can then be tested for significance, which the summary method does for you. If you have repeated measures, nlme::nlme extends this to a mixed effects model.

    Another approach is a non-parametric model:

    library(mgcv)
    mod1 <- gam(val ~ s(time, k = 4), data = data, select = TRUE)
    mod2 <- gam(val ~ animal + s(time, k = 4, by = animal), data = data, select = TRUE)
    #we need the parametric effect because smoothers are centered
    
    #compare both models, not sure which test is more appropriate, 
    #let's just do both Chisq and F
    anova(mod1, mod2, test = "Chisq")
    anova(mod1, mod2, test = "F")
    #significant difference between animal types
    #plots show which one grows faster
    gam.check(mod2)
    plot(mod2)
    summary(mod2)
    

    This can also be extended to a mixed effects model if necessary.