When I apply a dbRDA to a distance matrix (in this case the Bray-Curtis distance) like this:
dbrda(sqrt(dist) ~ ., site_vars)
is it OK to include a column of ordered factors in the site_vars variable? It is a data frame of values measured at the sampling sites, e.g. mean temperature, but it also includes a column "Soil" in which the soil types are ordered. Or is it necessary to put all the ordinal and nominal variables in a separate Condition argument in the formula?
Here is a small example:
library(vegan)  # provides vegdist() and dbrda()
data <- rbind(
c(1, 1, 0, 1, 1, 0, 0, 0, 0, 0),
c(1, 1, 1, 0, 1, 1, 0, 0, 0, 0),
c(0, 1, 0, 1, 0, 1, 1, 0, 1, 0),
c(1, 0, 0, 0, 1, 0, 1, 1, 1, 0),
c(0, 0, 0, 1, 0, 0, 0, 0, 1, 1)
)
rownames(data) <- c("Site_1", "Site_2", "Site_3", "Site_4", "Site_5")
colnames(data) <- c("Spec_1", "Spec_2", "Spec_3", "Spec_4", "Spec_5", "Spec_6", "Spec_7", "Spec_8", "Spec_9", "Spec_10")
dist <- vegdist(data, "bray")
site_vars <- data.frame(
Tmean = c(9, 10, 12, 14.5, 14),
SomethingElse = c(12, 14, 13, 16, 21),
Soil = c("good", "good", "OK", "OK", "bad")
)
site_vars$Soil <- ordered(site_vars$Soil, levels = c("good", "OK", "bad"))
# Version 1
dbRDA_Condition <- dbrda(sqrt(dist) ~ Tmean + SomethingElse + Condition(Soil), site_vars)
plot(dbRDA_Condition)
# Version 2
dbRDA <- dbrda(sqrt(dist) ~ Tmean + SomethingElse + Soil, site_vars)
plot(dbRDA)
Version 1 seems to disregard the fact that my soil variable is ranked. Version 2 generates an output I find a bit tricky to interpret, because in addition to the group centroids, it also displays arrows. I would expect one arrow for soil, as if it were a numerical variable with the values 1, 2 and 3 instead of three levels. However, it shows two arrows, labeled Soil.L and Soil.Q. Why are there two arrows for one variable? And what do *.L and *.Q stand for? Unfortunately, I haven't found any explanation.
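R analyses factors using contrasts. For unordered factors, the default contrasts are differences from the first factor level (treatment contrasts). For ordered factors, R uses polynomial contrasts: linear (L), quadratic (Q), cubic (C), fourth-order (^4), and so on. These labels are appended to the variable name, so Soil.L and Soil.Q are the linear and quadratic contrasts of Soil. A factor with k levels needs k - 1 contrasts, which is why your three-level Soil variable produces two arrows rather than one. Any guide to the R statistical environment covers this. dbrda does not invent this feature; it is the R standard.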
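The contrast coding can be inspected in base R without running the ordination. The following is a minimal sketch: the `soil` vector copies the Soil column from the question, and the last step (using the integer codes to get a single numeric variable) is only an assumption about what the asker might want, valid only if treating the levels as equally spaced is acceptable.

```r
# Ordered factor with three levels, copied from the question's Soil column
soil <- ordered(c("good", "good", "OK", "OK", "bad"),
                levels = c("good", "OK", "bad"))

# Contrast functions R currently uses for unordered and ordered factors
getOption("contrasts")

# Polynomial contrasts for 3 levels: k - 1 = 2 columns, named .L and .Q
contr.poly(3)

# The design matrix built from the factor carries those contrast columns,
# which is where the labels of the two arrows come from
mm <- model.matrix(~ soil)
colnames(mm)

# Assumption: if equal spacing between levels is acceptable, the integer
# codes of the factor give one numeric variable (and hence one arrow)
soil_num <- as.numeric(soil)
soil_num
```

Passing something like `soil_num` to the model instead of the ordered factor would fit a single linear term, at the cost of dropping the quadratic component that `Soil.Q` captures.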