I'm a new student in a bioinformatics lab, please feel free to correct me if anything is wrong.
I have made a CCA using the vegan package in R with the following script:
cca.analysis <- cca(mod ~ genus1 + genus2 + genus3, data)
I'm currently trying to measure the scores/contribution of each variable (genus) so I can determine which one was most influential on community variation in my dataset. I have two issues:
Edit: I have made a reproducible example, to help give some insight about the question. Here is the genus data:
║ genus_1 ║ genus_2 ║ genus_3 ║
║ 15.635 ║ 10.293 ║ 0 ║
║ 9.7813 ║ 9.0061 ║ 5.4298 ║
║ 15.896 ║ 2.5612 ║ 3.4335 ║
║ 4.0054 ║ 0 ║ 2.0043 ║
║ 15.929 ║ 16.213 ║ 0 ║
║ 11.072 ║ 15.434 ║ 0 ║
║ 12.539 ║ 7.2498 ║ 0 ║
║ 9.1164 ║ 11.526 ║ 2.1649 ║
║ 4.5011 ║ 0 ║ 0 ║
║ 11.66 ║ 13.46 ║ 5.1416 ║
The mod part in the formula I provided corresponds to the following data, which I extracted from a PCoA analysis:
║ Coord_1 ║ Coord_2 ║ Coord_3 ║ Coord_4 ║ Coord_5 ║ Coord_6 ║ Coord_7 ║
║ 0.954 ║ 0.928 ║ 0.952 ║ 1.009 ║ 1.016 ║ 0.943 ║ 1.031 ║
║ 0.942 ║ 1.088 ║ 1.100 ║ 1.015 ║ 1.080 ║ 1.140 ║ 1.002 ║
║ 0.932 ║ 0.989 ║ 1.005 ║ 0.974 ║ 0.990 ║ 1.047 ║ 1.035 ║
║ 0.929 ║ 1.111 ║ 1.094 ║ 0.847 ║ 0.932 ║ 0.940 ║ 1.016 ║
║ 0.947 ║ 1.008 ║ 0.937 ║ 1.055 ║ 1.056 ║ 0.964 ║ 1.022 ║
║ 0.948 ║ 1.054 ║ 0.987 ║ 1.018 ║ 1.017 ║ 0.965 ║ 0.994 ║
║ 0.946 ║ 1.023 ║ 0.911 ║ 1.014 ║ 1.062 ║ 1.076 ║ 1.063 ║
║ 1.041 ║ 1.000 ║ 0.945 ║ 0.872 ║ 1.036 ║ 0.907 ║ 1.029 ║
║ 0.926 ║ 1.107 ║ 1.027 ║ 0.943 ║ 0.993 ║ 1.006 ║ 0.947 ║
║ 1.038 ║ 1.016 ║ 1.008 ║ 1.013 ║ 0.997 ║ 0.891 ║ 0.988 ║
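For anyone who wants to run the example, here is a minimal sketch of how the two tables above could be entered in R and passed to cca. The column names follow the table headers (genus_1 etc.) rather than the genus1 spelling in the formula above, and the object name genus is just an assumption for illustration.

```r
library(vegan)

# Genus abundances from the table above (10 samples x 3 genera)
genus <- data.frame(
  genus_1 = c(15.635, 9.7813, 15.896, 4.0054, 15.929,
              11.072, 12.539, 9.1164, 4.5011, 11.66),
  genus_2 = c(10.293, 9.0061, 2.5612, 0, 16.213,
              15.434, 7.2498, 11.526, 0, 13.46),
  genus_3 = c(0, 5.4298, 3.4335, 2.0043, 0,
              0, 0, 2.1649, 0, 5.1416)
)

# PCoA coordinates from the table above, entered row by row
mod <- matrix(c(
  0.954, 0.928, 0.952, 1.009, 1.016, 0.943, 1.031,
  0.942, 1.088, 1.100, 1.015, 1.080, 1.140, 1.002,
  0.932, 0.989, 1.005, 0.974, 0.990, 1.047, 1.035,
  0.929, 1.111, 1.094, 0.847, 0.932, 0.940, 1.016,
  0.947, 1.008, 0.937, 1.055, 1.056, 0.964, 1.022,
  0.948, 1.054, 0.987, 1.018, 1.017, 0.965, 0.994,
  0.946, 1.023, 0.911, 1.014, 1.062, 1.076, 1.063,
  1.041, 1.000, 0.945, 0.872, 1.036, 0.907, 1.029,
  0.926, 1.107, 1.027, 0.943, 0.993, 1.006, 0.947,
  1.038, 1.016, 1.008, 1.013, 0.997, 0.891, 0.988
), ncol = 7, byrow = TRUE,
dimnames = list(NULL, paste0("Coord_", 1:7)))

# Same model as in the question, with genus_1..genus_3 as constraints
cca.analysis <- cca(mod ~ genus_1 + genus_2 + genus_3, data = genus)
cca.analysis
```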
You can plot this in R with the plot function, and you should hopefully get something like this:
CCA plot
Actually, the scaling of the constraining variables (genus1 etc.) does not influence their contributions to the model. You can verify this by multiplying one of your constraints by some number (say 10), refitting, and seeing that the resulting model does not change. What will change are the regression coefficients for the constraints, but they are of no interest here (the regression coefficient changes to cancel the effect of the multiplication).
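A minimal sketch of that check. It uses the varespec and varechem data sets bundled with vegan as a stand-in for your data, and the choice of Al, P and K as constraints is only an assumption for illustration; the same steps should apply to your own cca.analysis.

```r
library(vegan)

# Stand-in data shipped with vegan (not the data from the question)
data(varespec)
data(varechem)

# Original model
m1 <- cca(varespec ~ Al + P + K, data = varechem)

# Same model after multiplying one constraint by 10
scaled <- transform(varechem, Al = Al * 10)
m2 <- cca(varespec ~ Al + P + K, data = scaled)

# Constrained eigenvalues and total constrained inertia should be identical
eig_equal <- isTRUE(all.equal(m1$CCA$eig, m2$CCA$eig))
chi_equal <- isTRUE(all.equal(m1$CCA$tot.chi, m2$CCA$tot.chi))
eig_equal
chi_equal

# Only the regression coefficients of the rescaled constraint differ
coef(m1)
coef(m2)
```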
The key point is: what do you mean by "contribution"? If you mean how much of the total variation in the data each of these constraints "explains", you can get this information from anova(cca.analysis, by = "terms") or alternatively from anova(cca.analysis, by = "margin"). The first analysis is a sequential decomposition of the explained variation, where the components add up to 100% of the explained variation; the latter is a decomposition into unique terms, where the components do not add up to 100%. With up to three components (genera), you can also use the varpart function (for cca, with the argument chisquare = TRUE; for this you need the latest vegan release), which decomposes the total explained variation into unique and joint contributions.
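The three approaches could look roughly like this. Again this is only a sketch on vegan's bundled varespec and varechem data with Al, P and K as assumed constraints; replace them with your own response matrix and genus variables. The varpart call with chisquare = TRUE needs a vegan version that supports that argument.

```r
library(vegan)

# Stand-in data shipped with vegan (not the data from the question)
data(varespec)
data(varechem)

set.seed(42)  # the tests are permutation-based, so fix the seed
m <- cca(varespec ~ Al + P + K, data = varechem)

# Sequential decomposition: terms are added in formula order,
# and their inertia sums to the total constrained inertia
seq_tab <- anova(m, by = "terms", permutations = 999)
seq_tab

# Marginal decomposition: unique effect of each term given all others;
# these do not need to sum to the total constrained inertia
mar_tab <- anova(m, by = "margin", permutations = 999)
mar_tab

# Unique and joint fractions for three explanatory sets
vp <- varpart(varespec, ~ Al, ~ P, ~ K, data = varechem, chisquare = TRUE)
vp
```

Note that the sequential table depends on the order of the terms in the formula, whereas the marginal table does not.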
If you mean something else by "contribution", please explain.