r lme4 lmertest

Why do I get a negative "Estimate Std." when my data can never be negative?


I am running a script to find differences between the songs of birds (comparing lengths, frequencies, and other measures). I am fitting linear mixed-effects models with the lme4 package. The output includes a negative "Estimate Std.", and since (for instance) the length of a song cannot be negative, I wonder if anybody could tell me what I am doing wrong. Details are below.

I have looked for errors in my data and tried different ways of arranging the data, and I get the same results.

This is how I have the data organized:


Bird    site    length  freq    
1   FH  2.69    4354    -58.9
1   FH  2.546   4298    -57.3
1   FH  2.043   5303    -53.7
2   FH  4.437   6084    -63.1
11  ML  3.371   4689    -37.1
12  ML  3.706   5470    -39.7
13  ML  4.331   5358    -48.7
13  ML  4.124   4744    -39.8
14  ML  3.802   5805    -42.5

This is the full code:


#1 song length####

library("lmerTest") 
model1<-lmer(length~site
             +(1|Bird), 
             data=dframe1)

summary(model1)
anova(model1, test="F")

pdat <- expand.grid (site=c("ML", "SI","FH", "SH"))

detach(package:lmerTest) # 
model1<-lmer(length~site
             +(1|Bird), 
             data=dframe1)

library("AICcmodavg")  # predictSE() comes from the AICcmodavg package
pred <- predictSE(model1, newdata = pdat, re.form = NA,
                  se.fit = TRUE, na.action = na.exclude,
                  type = "response")
pred

predframe <- data.frame(pdat, pred)
predframe

plot(
  NULL
  , xlim = c(0.75,4.25)
  , ylim = c(3,6)
  , axes = FALSE
  , ylab = ""
  , xlab = ""
)
at.x <- c(1,2,3,4)
at.lab <- c(1,2,3,4)

for (i in 1:nrow(predframe))
{arrows( 
  x0 = at.x[i]
  , y0 = (predframe$fit[i] + predframe$se.fit[i])
  , x1 = at.x[i]
  , y1 = (predframe$fit[i] - predframe$se.fit[i])
  , code = 3  
  , angle = 90  
  , length = 0.12  
  , col = "gray25")
  points(
    x = at.x[i]
    , y = predframe$fit[i]
    , pch = 21
    ,bg="black"
    , col = "black"
    , cex = 1.25)  # point size
}


axis(1, labels = c("Mainland","Sully", "Flat Holm","Skokholm"), at = at.lab)  
axis(2, at = c(3,4,5,6), labels = c(3,4,5,6), las = 1, cex.axis = 1)  
box()  
title(xlab = "Location",  line = 2.5, cex = 0.8)  
title(ylab = expression(paste("song length (secs)")), line = 2.75)  

Below is the first part of the results. I am not sure why site FH (siteFH -0.9480) comes out negative. This happens with other variables as well, so I guess something must be wrong with the model. I am a beginner, so please be considerate. I have already searched and have not found a similar question.

Thank you in advance.


Results
`Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.1852 -0.4119 -0.0071  0.5304  2.2659 

Random effects:
 Groups   Name        Variance Std.Dev.
 Bird     (Intercept) 0.51798  0.7197  
 Residual             0.07313  0.2704  
Number of obs: 112, groups:  Bird, 42

Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)    
(Intercept)   4.2429     0.1787 37.6710  23.745  < 2e-16 ***
siteFH       -0.9480     0.2965 36.3879  -3.197 0.002871 ** 
siteSH        1.2641     0.3173 35.4150   3.983 0.000323 ***
siteSI       -0.4258     0.3515 35.2203  -1.212 0.233769    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
       (Intr) siteFH siteSH
siteFH -0.603              
siteSH -0.563  0.339       
siteSI -0.508  0.306  0.286
> anova(model1, test="F")
Type III Analysis of Variance Table with Satterthwaite's method
     Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)    
site 3.0075  1.0025     3 35.336  13.709 4.337e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1`


Solution

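  • The column headers in the output are right-aligned, so "Estimate" and "Std. Error" are two separate columns. There is no column called "Estimate Std."; the negative numbers are in the Estimate column, and the standard errors are all positive.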

    The estimate describes the association between your dependent and independent variables. It does not describe any values in your dataset.

    For a categorical predictor such as site, a negative estimate just means that the estimated mean of your dependent variable (length) at that site is lower than at the reference site. It is a difference between two means, so it can be negative even when every observed length is positive.

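    In detail, an estimate of -0.948 in your case means that the estimated mean length for site FH is about 0.948 lower than for site ML (the reference category, which has no row of its own in the output because it is represented by the intercept). The intercept, 4.2429, is the estimated mean length at ML, so the estimated mean at FH is 4.2429 - 0.9480 = 3.2949, which is still positive. It does not mean that the length at FH is negative.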
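
    To see the implied mean length for every site, add each site coefficient to the intercept. The following is a minimal sketch in base R that uses only the estimates copied from the summary output in the question; the variable names (`intercept`, `contrasts`, `site_means`) are made up for illustration and are not part of lme4.

```r
# Fixed-effect estimates copied by hand from the summary() output in the question
intercept <- 4.2429                                   # estimated mean length at ML (reference site)
contrasts <- c(FH = -0.9480, SH = 1.2641, SI = -0.4258)  # differences from ML

# Estimated mean for each site = intercept + that site's coefficient
# (ML has no coefficient of its own, so its mean is just the intercept)
site_means <- c(ML = intercept, intercept + contrasts)

print(round(site_means, 4))

# Every implied mean is positive even though two coefficients are negative
print(all(site_means > 0))
```

    With the fitted model at hand, `fixef(model1)` should return the same four numbers, so the values would not need to be typed in manually. The `pred` object in your own script holds the same site-level means, which is why your plot shows only positive values.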