I've got a large data frame of 800 rows and 30 columns and a response vector of length 800. I'd like to display a heatmap of the data frame side by side with the response column, with the two colored independently (the response variable is on a very different scale, so I don't want to use the same coloring scheme for both). Currently my heatmap looks like this (the vertical lines are added as part of my analysis; they are not part of the heatmap).
But I have a small-scale reproducible example here:
mydf <- as.matrix(data.frame(A = sample(10, 20, replace = TRUE),
                             B = sample(10, 20, replace = TRUE),
                             C = sample(10, 20, replace = TRUE),
                             D = sample(10, 20, replace = TRUE),
                             E = sample(10, 20, replace = TRUE)))
response <- sample(100, 20, replace = TRUE)
mydf_order <- hclust(dist(mydf))$order
heatmap(mydf[mydf_order,], Rowv = NA, Colv = NA, labRow = as.character(mydf_order))
I'm able to produce a heatmap of the hierarchically clustered mydf, but would like to also display the response column alongside it with an independent coloring scheme, so that I can see if the grouping in mydf corresponds to response.
Thank you
One approach is to map your response variable to a color gradient, e.g. using the {scales} package:
library(scales)
response_colors <- colour_ramp(c("blue", "red"))(response / max(response))
## > head(response_colors)
## [1] "#FA001C" "#D2007B" "#DF0062" "#AF00AD" "#E40058" "#4F00F2"
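Dividing by max(response) only stretches the gradient up to the maximum, so the lowest response does not land on pure blue unless it is close to zero. As a hedged alternative (not part of the original answer), here is a minimal base-R sketch that min-max rescales first, using grDevices::colorRamp instead of {scales}. The fixed seed and the name response_colors_minmax are assumptions made for illustration, and the sketch assumes the response is not constant (otherwise the denominator is zero).

```r
# Assumption: fixed seed so the example is reproducible.
set.seed(1)
response <- sample(100, 20, replace = TRUE)

# Min-max rescale to [0, 1]; requires max(response) > min(response).
scaled <- (response - min(response)) / (max(response) - min(response))

# colorRamp() returns a function mapping [0, 1] to a matrix of RGB values (0-255).
ramp <- grDevices::colorRamp(c("blue", "red"))

# Convert the RGB matrix to hex color strings, one per response value.
response_colors_minmax <- grDevices::rgb(ramp(scaled), maxColorValue = 255)
```

With this scaling the smallest response is drawn in pure blue and the largest in pure red, which uses the full range of the gradient.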
Then, use these for the RowSideColors argument (make sure the order corresponds to that of the reordered mydf):
heatmap(mydf[mydf_order, ],
        Rowv = NA,
        Colv = NA,
        labRow = as.character(mydf_order),
        # reorder the colors with the same permutation as the rows
        RowSideColors = response_colors[mydf_order]
)
result:
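Because the rows are permuted by hand and Rowv = NA, the side colors have to go through the same permutation or they will be attached to the wrong rows. The following is a self-contained sketch of one way to make that easy to check: it appends the response value to each row label so the color strip can be compared against the numbers by eye. The seed, the short name ord, and the use of base-R colorRamp in place of {scales} are assumptions made to keep the example dependency-free.

```r
# Assumption: fixed seed so the example is reproducible.
set.seed(42)
mydf <- matrix(sample(10, 100, replace = TRUE), nrow = 20,
               dimnames = list(NULL, LETTERS[1:5]))
response <- sample(100, 20, replace = TRUE)

# Row order from hierarchical clustering, as in the question.
ord <- hclust(dist(mydf))$order

# Map the response to a blue-to-red gradient (base R stand-in for scales::colour_ramp).
ramp <- grDevices::colorRamp(c("blue", "red"))
response_colors <- grDevices::rgb(ramp(response / max(response)),
                                  maxColorValue = 255)

# Apply the same permutation to the rows, the labels and the side colors.
heatmap(mydf[ord, ],
        Rowv = NA, Colv = NA,
        labRow = paste0(ord, " (", response[ord], ")"),
        RowSideColors = response_colors[ord])
```

If the strip is aligned, rows labeled with high response values appear next to red cells and rows with low values next to blue ones.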