
How to compute/extract the residual variance from an Isomap [vegan] model in R


I am trying to understand how Isomap results differ from those of PCA and MDS, and whether Isomap is better suited to my data. To do so, I started working with the isomap function provided by vegan in R, using the BCI dataset and the basic example from the documentation, https://www.rdocumentation.org/packages/vegan/versions/2.4-2/topics/isomap (code below). Several publications use the residual variance as a measure for comparing the methods (e.g. the original Isomap paper by Tenenbaum et al. from 2000, p. 2321, https://web.mit.edu/cocosci/Papers/sci_reprint.pdf). However, so far I have failed to extract this information from the object "ord" in the example. There is an element ord[["eig"]] that is probably connected to it, but I am not sure how. Help would be much appreciated!

library(vegan)
data(BCI)
dis <- vegdist(BCI)
tr <- spantree(dis)
pl <- ordiplot(cmdscale(dis), main="cmdscale")
lines(tr, pl, col="red")
ord <- isomap(dis, k=3)
ord 

plot(ord[["eig"]])  # plot of the eig values, index represents sample number (?)


Solution

  • So I did some further investigation on this topic.

    Essentially, a PCA yields as many eigenvalues as there are variables. Each eigenvalue belongs to one of the new components or dimensions and measures its explanatory power. The components are ordered so that the first one usually explains the most, i.e. has the largest eigenvalue. With standardized variables, an eigenvalue of 1 means the component explains only as much as a single original variable, which is pretty boring. Mathematically, an eigenvalue is the sum of the squared factor loadings of its component.

    For the Isomap model from the example above, this can be done as follows:

    gram <- ord[["eig"]]  # eigenvalues of the Gram matrix
    sum(gram)  # close to ~55 here; this is not the number of variables (BCI has 225 species
               # in 50 plots). Unlike Principal Component Analysis, the Isomap procedure can
               # alter the total variance in the dataset by compression and expansion.

    # cumulative share of variance covered by the first 1, 2, ... dimensions
    variance_covered <- cumsum(gram) / sum(gram)
    print(variance_covered)
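
    The eigenvalue shares above are not quite the residual variance asked about in the question. As far as I read Tenenbaum et al., their residual variance is 1 - R^2 between the geodesic (graph) distances and the Euclidean distances in the low-dimensional embedding. The following is a minimal sketch of that idea, not an official vegan feature: the helper residual_variance() and the choice of at most 5 dimensions are my own assumptions, while isomapdist() and the points element are taken from vegan.

    ```r
    library(vegan)

    data(BCI)
    dis <- vegdist(BCI)              # dissimilarities, as in the question
    k   <- 3                         # same neighbourhood size as in the question

    geo <- isomapdist(dis, k = k)    # geodesic (shortest-path) distances used by Isomap
    ord <- isomap(dis, k = k)        # embedding; ord$points holds the coordinates

    # Hypothetical helper: residual variance for the first d dimensions, i.e.
    # 1 - R^2 between the geodesic distances and the Euclidean distances
    # measured in the d-dimensional embedding.
    residual_variance <- function(points, geo, d) {
      emb <- dist(points[, seq_len(d), drop = FALSE])
      1 - cor(as.vector(geo), as.vector(emb))^2
    }

    dims    <- seq_len(min(5, ncol(ord$points)))
    res_var <- sapply(dims, function(d) residual_variance(ord$points, geo, d))

    plot(dims, res_var, type = "b",
         xlab = "Isomap dimensionality", ylab = "Residual variance")
    ```

    Plotting res_var against the number of dimensions gives the kind of elbow curve shown in the paper. The same helper can be applied to cmdscale(dis) coordinates (comparing against dis instead of geo) to put classical MDS on the same plot.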
    

    The following PCA with [prcomp] gives a more straightforward example:

    data(mtcars)
    head(mtcars)
    
    cars.autoscale <- prcomp(mtcars,
                 center = TRUE,
                 scale. = TRUE) 
    
    sdev <- cars.autoscale$sdev        # standard deviations of the principal components
    eigenvals <- sdev^2                # squared standard deviations = eigenvalues
    total_eigenvals <- sum(eigenvals)  # this sum is 11, which is the number of variables in mtcars

    # cumulative share of variance covered by the first 1, 2, ... components
    var_sum <- cumsum(eigenvals) / total_eigenvals
    print(var_sum)
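
    The statement that an eigenvalue is the sum of squared factor loadings can be checked directly on the same data. This is a small sketch in base R; it assumes "loadings" in the factor-analysis sense, i.e. the eigenvectors in the rotation element scaled by the component standard deviations (the unscaled rotation columns always have unit length). The names loadings and sum_sq_loadings are my own.

    ```r
    pca <- prcomp(mtcars, center = TRUE, scale. = TRUE)

    eigenvals <- pca$sdev^2   # eigenvalues of the correlation matrix

    # Loadings: eigenvectors scaled by the standard deviation of each component
    loadings <- pca$rotation %*% diag(pca$sdev)

    # Column-wise sum of squared loadings, one value per component
    sum_sq_loadings <- colSums(loadings^2)

    all.equal(sum_sq_loadings, eigenvals)  # the two vectors agree
    sum(eigenvals)                         # equals ncol(mtcars) for standardized data
    ```

    Because every standardized variable contributes a variance of 1, the eigenvalues add up to the number of variables, which is why an eigenvalue of 1 corresponds to the information of a single variable.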