I'm using the following code to draw a heatmap with ComplexHeatmap in R. I tried two methods, marked in the code comments. The only difference is that in the first I let ComplexHeatmap do the clustering, while in the second I do the clustering myself and pass the resulting hclust object to Heatmap. In theory both should give the same clustering of the data, but they don't, and I'm not sure what I'm misunderstanding. I know the heatmaps themselves should differ, but I believe the clustering of the samples should be identical. I've shared the data in this GitHub post about this. PS: The two clusterings are similar (but displayed in reversed order), not exactly the same.
library(ComplexHeatmap)
library(circlize)  # provides colorRamp2()

lung <- read.csv("all_lung.csv")

# Collapse the mesothelioma subtypes into a single "meso" group
meso_subtypes <- c("Not.Otherwise.Specified", "Epithelioid",
                   "Sarcomatoid", "Biphasic")
lung$subtype_grouped_meso <- lung$subtype
lung$subtype_grouped_meso[lung$subtype %in% meso_subtypes] <- "meso"
subtype <- lung[["subtype_grouped_meso"]]

rownames(lung) <- lung[["X"]]
# chr_keep (the columns to keep) is defined elsewhere and not shown here
lung <- lung[, chr_keep]

subtype_colors <- c(
  "Adeno"    = "red",
  "Squamous" = "green",
  "SCLC"     = "blue",
  "meso"     = "orange"
)

# Transpose so that samples are columns
lung <- t(lung)

column_ha <- HeatmapAnnotation(
  subtype = subtype,
  counts  = log10(colSums(lung)),
  col     = list(subtype = subtype_colors)
)

# Method 1 (let ComplexHeatmap do the clustering)
Heatmap(lung,
        top_annotation = column_ha,
        clustering_distance_columns = "pearson",
        clustering_method_columns = "ward.D",
        cluster_rows = FALSE,
        show_column_names = FALSE,
        show_row_names = FALSE,
        show_row_dend = FALSE)

# Method 2 (manually do the clustering)
cor_matrix <- cor(lung, method = "pearson")
cor_distance <- as.dist(1 - cor_matrix)
hc <- hclust(cor_distance, method = "ward.D")
Heatmap(cor_matrix,
        top_annotation = column_ha,
        name = "correlation",
        cluster_columns = hc,
        cluster_rows = hc,
        show_column_names = FALSE,
        show_row_names = FALSE,
        show_row_dend = FALSE,
        col = colorRamp2(c(-1, 0, 1), c("blue", "white", "red")))
I found a similar question here. The punchline is best described by the documentation of the reorder.dendrogram function from the stats package:
There are many different orderings of a dendrogram that are consistent with the structure imposed. This function takes a dendrogram and a vector of values and reorders the dendrogram in the order of the supplied vector, maintaining the constraints on the dendrogram.
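To make that concrete, here is a minimal sketch in base R on simulated data. The matrix `m`, the sample names `s1` to `s6`, and the weights are made up for illustration and are not taken from the question's dataset. It builds a tree with the same recipe as Method 2, reorders it, and checks that the tree structure is unchanged even though the left-to-right leaf order may differ.

```r
# Simulated data: 10 features (rows) x 6 samples (columns); purely illustrative
set.seed(1)
m <- matrix(rnorm(60), nrow = 10,
            dimnames = list(NULL, paste0("s", 1:6)))

# Same recipe as Method 2: correlation distance + Ward clustering
hc <- hclust(as.dist(1 - cor(m, method = "pearson")), method = "ward.D")
dend <- as.dendrogram(hc)

# Reorder the leaves using arbitrary weights (here: negative column means)
dend_reordered <- reorder(dend, wts = -colMeans(m), agglo.FUN = mean)

# Leaf order (left to right) before and after reordering
labels(dend)
labels(dend_reordered)

# The tree itself is unchanged: the cophenetic distance (merge height)
# between every pair of samples is the same in both dendrograms
labs <- colnames(m)
coph_before <- as.matrix(cophenetic(dend))[labs, labs]
coph_after  <- as.matrix(cophenetic(dend_reordered))[labs, labs]
same_tree <- isTRUE(all.equal(coph_before, coph_after))
same_tree
```

As far as I can tell, ComplexHeatmap applies this kind of reordering by default when it computes the clustering itself (Method 1) but not when it is handed a precomputed hclust object (Method 2); the `column_dend_reorder` and `row_dend_reorder` arguments of `Heatmap()` appear to control this. Setting `column_dend_reorder = FALSE` in Method 1 would be the first thing to try if the two column orders need to match exactly.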