I was tasked with performing a PERMANOVA test on my data, which consists of counts of specific genes found in different types of soils. The goal is to test whether different types of land differ in the total number of genes. I am new to both the concept of PERMANOVA tests and the adonis function.
I have two files, one with the total genes in each sample:
head(new_df)
sum
L1 2107.2634619
L10 1916.4122739
L100 1129.1259035
L101 31.1241711
L102 4.3310406
L103 0.6941578
and another which associates each sample with the type of land:
head(meta)
land
L1 Woodland
L10 Grassland
L100 Grassland
L101 Grassland
L102 Grassland
L103 Cropland
It is my understanding that the adonis function requires a dataframe and a factor in the formula, so I used this command:
results <- adonis2(formula = new_df ~ land, data = meta, permutations = 999)
which gave me this output:
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 999
adonis2(formula = new_df ~ LC_simpl_2018, data = meta, permutations = 999)
Df SumOfSqs R2 F Pr(>F)
LC_simpl_2018 3 0.521 0.00261 0.544 0.872
Residual 624 199.245 0.99739
Total 627 199.766 1.00000
My main question is whether I implemented the function correctly. Thanks in advance, and sorry if it is not clear.
No, you didn't correctly specify the test. For starters, sum is in mydatat (and that is all that mydatat contains), so it seems you tried to model sum using sum, which makes no sense. If there is more in mydatat than just sum, then you still over-inflated the variance explained, because the same variable exists on both sides of the formula.
More generally, PERMANOVA is not a specific test but a "non-parametric" version of a multivariate linear model. It is no more a test than regression is a test. Hence whether you did the "test" correctly or not depends entirely on what hypothesis you were trying to test or what features of the data you were trying to model. You do not provide such information, and even if you had, Stack Overflow is not the appropriate venue for discussion of statistical models.
If your data are actually a factor plus the sum variable, then I would just fit this as a Poisson GLM with glm(). If you want to fit this using PERMANOVA, then you have the formula back to front. You want sum ~ group, data = mydatat where group is the factor for the soil types. I forget now whether or not adonis2() can work with a vector response. It really doesn't make sense to compute a dissimilarity from a single variable...
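As a rough sketch of both options, here is a version on simulated data. The data frame, group sizes, and values are made up for illustration; land stands in for the soil-type factor. Two choices here are my assumptions rather than anything your data dictate: I use quasipoisson instead of poisson because the sums shown are not integers, and I pass adonis2() an explicit Euclidean distance object rather than relying on how it handles a bare vector.

```r
library(vegan)  # provides adonis2()

set.seed(42)

# Simulated stand-in for the real data (assumption: one row per sample,
# with the response and the grouping factor in the same data frame,
# so the rows are guaranteed to be in the same order)
meta <- data.frame(
  land = factor(rep(c("Woodland", "Grassland", "Cropland"), each = 20))
)
meta$sum <- rgamma(nrow(meta), shape = 2, rate = 0.01)  # positive, non-integer

# Option 1: a GLM with the response on the left-hand side.
# quasipoisson keeps the log link but tolerates non-integer values.
fit <- glm(sum ~ land, family = quasipoisson(link = "log"), data = meta)
glm_tab <- anova(fit, test = "F")
print(glm_tab)

# Option 2: PERMANOVA on a distance matrix built from the single variable.
# The response (here a dist object) goes on the left, the factor on the right.
d <- dist(meta$sum, method = "euclidean")
perm <- adonis2(d ~ land, data = meta, permutations = 999)
print(perm)
```

With a single response variable and Euclidean distances, the PERMANOVA pseudo-F is the ordinary one-way ANOVA F statistic, only with a permutation p-value, which is another way of seeing that little is gained from the dissimilarity machinery here.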