Tags: r, ggplot2, r-corrplot, ggcorrplot

Calculating non-pairwise correlation between different sets of columns of a dataset


To calculate the correlation matrix of the mtcars dataset we can use corr <- round(cor(mtcars), 1), or calculate and plot it with corrplot(cor(mtcars)). However, the result is the pairwise correlation between all columns. Is it possible to calculate the correlation between two specific sets of columns only, e.g., in mtcars, between columns 1:3 (mpg, cyl, disp) and columns 4:8 (hp, drat, wt, qsec, vs)?

Please suggest any method.

I have a dataset with 120 columns. I want to calculate and plot the correlation matrix (Spearman correlation) of columns 1:50 against the remaining columns (51:120). If possible, please suggest a solution using cor or packages such as corrplot or ggcorrplot.

Thanks in anticipation.


Solution

  • You can do this with the cor function by passing the second set of columns as its second argument (y). Example:

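    library(corrplot)                 # provides corrplot()

    vertical_axis <- 1:3              # rows of the plot: mpg, cyl, disp
    horizontal_axis <- 4:8            # columns of the plot: hp, drat, wt, qsec, vs
    # cor(x, y) correlates each column of x with each column of y (3 x 5 matrix)
    corr <- cor(mtcars[, vertical_axis], mtcars[, horizontal_axis])
    corrplot(corr)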
    

    The result:

    [Image: correlation plot demo]


    NOTE: Passing the order parameter to corrplot on a matrix like this one will produce errors or a wrong plot. If you need to reorder the variables (e.g., with hclust), reorder the correlation matrix yourself before passing it to corrplot.
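
Since the question asks for Spearman correlation and the note above mentions reordering, here is a minimal sketch that combines both on mtcars. The names set_a, set_b, row_order, col_order and corr_ordered are illustrative, and clustering on the Euclidean distance between correlation profiles is just one possible choice of ordering, not something the answer prescribes. The corrplot call is guarded so the rest runs even if the package is not installed.

```r
# Column sets to compare (same split as in the example above)
set_a <- 1:3   # mpg, cyl, disp
set_b <- 4:8   # hp, drat, wt, qsec, vs

# cor() with two inputs returns a length(set_a) x length(set_b) matrix;
# method = "spearman" switches from Pearson to rank correlation
corr <- cor(mtcars[, set_a], mtcars[, set_b], method = "spearman")

# Reorder rows and columns separately with hierarchical clustering.
# Assumption: Euclidean distance between correlation profiles is a
# reasonable similarity measure here; other distances work as well.
row_order <- hclust(dist(corr))$order
col_order <- hclust(dist(t(corr)))$order
corr_ordered <- corr[row_order, col_order]

# Plot the already-reordered matrix, without using the order argument
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(corr_ordered)
}
```

For the 120-column case the same call would presumably look like cor(df[, 1:50], df[, 51:120], method = "spearman"), where df is a placeholder for your data frame and all selected columns are numeric. If the data contain missing values, the use argument of cor (for example use = "pairwise.complete.obs") may be needed.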