I'm looking to replicate this correlation plot, or at least get as close as possible to it.
Specifically, I want:
I have gotten relatively close, but have not managed a precise enough replication. I describe my closest attempts below with reproducible code. The corrplot package has gotten me closest.
# general preparation
library(car)
correlations = cor(mtcars)
library(corrplot)
corrplot.mixed(correlations,
upper = "number", #upper.col = ???
lower = "circle", #lower.col = ???
tl.pos = "lt", tl.col = "black", tl.cex = 0.5)
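For the upper.col and lower.col placeholders above, one possibility is to pass the same colour vector to both arguments of corrplot.mixed. This is only a minimal sketch: the choice of a viridis-like palette from base R's hcl.colors() and the number of colours (200) are assumptions for illustration, not something I have settled on.

```r
library(corrplot)

correlations <- cor(mtcars)

# Assumption: a viridis-like palette with 200 steps, built with base R
pal <- hcl.colors(200, palette = "viridis")

corrplot.mixed(correlations,
               upper = "number", upper.col = pal,  # coefficients coloured by value
               lower = "circle", lower.col = pal,  # circles use the same palette
               tl.pos = "lt", tl.col = "black", tl.cex = 0.5)
```

Using one palette for both triangles should keep the number colours and circle colours on the same scale.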
Notes:
mtcars
), the coefficient colours don't match the actual correlation values. I cannot give a reproducible code example here, because it works fine with the mtcars data.
source("http://www.sthda.com/upload/rquery_cormat.r")
rquery.cormat(mtcars)
library("ggcorrplot")
# circles separate
ggcorrplot(correlations,           # correlation matrix
           method = "circle",      # circles instead of squares
           type = "upper",         # show only upper triangle
           show.diag = FALSE,      # don't show diagonal values (1)
           lab = FALSE,            # don't show cor coeffs
           outline.col = "white",  # white outline around circles
           ggtheme = theme_bw,     # theme
           colors = c("#440154FF", "#238A8DFF", "#FDE725FF"))
# coefs separate
ggcorrplot(correlations,           # correlation matrix
           method = "circle",      # circles instead of squares
           type = "upper",         # show only upper triangle
           show.diag = FALSE,      # don't show diagonal values (1)
           lab = TRUE,             # show cor coeffs
           outline.col = NA,       # no outline around circles
           ggtheme = theme_bw,     # theme
           colors = c("#440154FF", "#238A8DFF", "#FDE725FF"))
# can't combine both plots?
library(corrgram)
corrgram(correlations,
         labels = colnames(correlations),  # indices_all (a label vector) in my own data
         lower.panel = "panel.fill",
         upper.panel = "panel.cor")
Some other notes:
Out-of-the-box options are quick and nice. However, when it comes to customizing, IMHO it may be worthwhile to build up the plot from scratch using ggplot2. As a first step this involves some data wrangling to get your correlation matrix into the right shape. In this step I also convert the categories to factors and add a numeric id. Based on the ids I split the data into the values above and below the diagonal, which can then be plotted separately using a geom_point and a geom_text. Besides that, it's important to add drop = FALSE to the x and y scales to keep all factor levels in the right order. I also use some functions to get the desired axis labels:
EDIT: Following the suggestion by @AllanCameron I added a coord_equal as the "final" touch to get a nice square, matrix-like look. And thanks to @RichtieSacramento the code now maps the absolute value on the size aes.
library(dplyr)
library(tidyr)
library(ggplot2)
correlations = cor(mtcars)
levels <- colnames(mtcars)
corr_long <- correlations %>%
  data.frame() %>%
  mutate(row = factor(rownames(.), levels = levels),
         rowid = as.numeric(row)) %>%
  pivot_longer(-c(row, rowid), names_to = "col") %>%
  mutate(col = factor(col, levels = levels),
         colid = as.numeric(col))
ggplot(corr_long, aes(col, row)) +
  geom_point(aes(size = abs(value), fill = value),
             data = ~filter(.x, rowid > colid), shape = 21) +
  geom_text(aes(label = scales::number(value, accuracy = .01), color = abs(value)),
            data = ~filter(.x, rowid < colid), size = 8 / .pt) +
  scale_x_discrete(labels = ~ attr(.x, "pos"), drop = FALSE) +
  scale_y_discrete(labels = ~ paste0(.x, " (", attr(.x, "pos"), ")"), drop = FALSE) +
  scale_fill_viridis_c(limits = c(-1, 1)) +
  scale_color_gradient(low = grey(.8), high = grey(.2)) +
  coord_equal() +
  guides(size = "none", color = "none") +
  theme(legend.position = "bottom",
        panel.grid = element_blank(),
        axis.ticks = element_blank()) +
  labs(x = NULL, y = NULL, fill = NULL)
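To check the reshaping and the split by ids independently of the plot, the same idea can be sketched in base R. This is only an illustration under assumptions: the names vars, circles and labels are made up here, and as.table() is swapped in for the dplyr/tidyr pipeline above.

```r
correlations <- cor(mtcars)
vars <- colnames(correlations)

# Long format: one row per (row, col) pair of the correlation matrix
corr_long <- as.data.frame(as.table(correlations), responseName = "value")
names(corr_long)[1:2] <- c("row", "col")

# Numeric ids follow the original column order of the data
corr_long$rowid <- as.integer(factor(corr_long$row, levels = vars))
corr_long$colid <- as.integer(factor(corr_long$col, levels = vars))

# Same filters as in the ggplot2 layers
circles <- subset(corr_long, rowid > colid)  # cells passed to geom_point
labels  <- subset(corr_long, rowid < colid)  # cells passed to geom_text

# Each off-diagonal half holds n * (n - 1) / 2 cells; the diagonal is dropped
n <- length(vars)
stopifnot(nrow(circles) == n * (n - 1) / 2,
          nrow(labels)  == n * (n - 1) / 2)
```

Because a correlation matrix is symmetric, both halves carry the same values, so no information is lost by showing one half as circles and the other as numbers.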