r anova tukey

How do I run a two-way ANOVA that uses type III sums of squares and includes pairwise comparisons?


I have a dataset in which I would like to test the effects of species and habitat on home range size, using type III sums of squares and pairwise comparisons within species and habitat.
Here's a subset of the data:

    species   <- c("a","b","c","c","b","c","b","b","a","b","c","c","a","a","b","b","a","a","b","c")
    habitat   <- c("x","x","x","y","y","y","x","x","y","z","y","y","z","z","x","x","y","y","z","z")
    homerange <- c(6,5,7,8,9,4,3,5,6,9,3,6,6,7,8,9,5,6,7,8)
    # Build the data frame directly; cbind() would coerce every column to character
    data1 <- data.frame(species = factor(species), habitat = factor(habitat), homerange = homerange)

Currently I am splitting the data by species and running a separate ANOVA for each, but I believe it makes more sense to ask about species and habitat at the same time with one ANOVA. Here's an example of the ANOVA I ran for one species:

    data.species.a <- subset(data1, species == "a")
    fit <- aov(homerange ~ habitat, data = data.species.a)
    summary(fit)
    TukeyHSD(fit)

aov() appears to use type I (sequential) sums of squares, which I don't think are appropriate; I also believe Tukey's test may be too conservative for the pairwise comparisons. Can someone suggest an approach that lets me run one ANOVA considering the effects of both species and habitat on home range, with type III sums of squares, and that also permits less conservative pairwise comparisons of species and habitat?


Solution

  • You can have Anova() in package 'car' report type III sums of squares, and HSD.test() in package 'agricolae' should be able to take the fitted model object as input. I do not think you can legitimately use aov() with unbalanced data such as yours, so I am fitting the model with lm() instead.

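    library(car)        # provides Anova()
    library(agricolae)  # provides HSD.test()

    # Fit with lm() rather than aov() because the design is unbalanced
    fit <- lm(homerange ~ habitat, data = data.species.a)
    Anova(fit, type = "III")  # type III sums of squares
    comparison <- HSD.test(fit, "habitat", group = TRUE)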
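
    The fit above still covers a single species. As a rough sketch of the single two-way model the question asks for, and not necessarily the approach the answer had in mind, the same Anova() call can be applied to a model with both factors and their interaction. The name fit2 and the choice of sum-to-zero contrasts are assumptions made here; type III tests are generally only interpretable when such contrasts are used.

    ```r
    library(car)  # provides Anova(); assumed to be installed

    # Sample data from the question, rebuilt so the sketch runs on its own
    species   <- c("a","b","c","c","b","c","b","b","a","b",
                   "c","c","a","a","b","b","a","a","b","c")
    habitat   <- c("x","x","x","y","y","y","x","x","y","z",
                   "y","y","z","z","x","x","y","y","z","z")
    homerange <- c(6,5,7,8,9,4,3,5,6,9,3,6,6,7,8,9,5,6,7,8)
    data1 <- data.frame(species = factor(species),
                        habitat = factor(habitat),
                        homerange = homerange)

    # Assumption: sum-to-zero contrasts for both factors, so that the type III
    # main-effect tests are meaningful in the presence of the interaction
    fit2 <- lm(homerange ~ species * habitat, data = data1,
               contrasts = list(species = contr.sum, habitat = contr.sum))

    # Type III table with rows for species, habitat and species:habitat
    tab3 <- Anova(fit2, type = "III")
    print(tab3)
    ```

    With only 20 observations spread over nine species-by-habitat cells, some cells hold a single value, so the interaction test on this subset should be read with caution.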
    

    Note that the SAS default of type III sums of squares is viewed with disdain (and sometimes even outright derision) by the authors of the R base package (read this for more details). The presentation of that method in package 'car' is mainly for purposes of comparison, rather than a recommendation regarding statistical correctness.

    To add citations for the reasons to be very cautious about accepting the SAS standard: see Frank Harrell's comments on loss of power, and Bill Venables' later comments in the same thread on r-help.
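
    On the question's request for pairwise comparisons that are less conservative than Tukey's test, one possible option, sketched here with base R only, is pairwise.t.test() with the Benjamini-Hochberg adjustment. This controls the false discovery rate rather than the family-wise error rate, which is a weaker guarantee and is why it tends to reject more often. The sketch is a one-way simplification: it compares habitats using a pooled standard deviation and ignores species, and the name pw is just illustrative.

    ```r
    # Sample data from the question, rebuilt so the sketch is self-contained
    habitat   <- factor(c("x","x","x","y","y","y","x","x","y","z",
                          "y","y","z","z","x","x","y","y","z","z"))
    homerange <- c(6,5,7,8,9,4,3,5,6,9,3,6,6,7,8,9,5,6,7,8)

    # Pairwise t-tests between habitats (pooled SD by default),
    # with p-values adjusted by the Benjamini-Hochberg method
    pw <- pairwise.t.test(homerange, habitat, p.adjust.method = "BH")
    print(pw)
    ```

    Swapping "BH" for "holm" or "bonferroni" gives family-wise error control instead, at the cost of more conservative p-values.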