I need to make a heatmap with the function 'pheatmap', using UPGMA clustering and 1 - Pearson correlation as the distance metric. My professor claims this is the default distance metric, but in my case it uses Euclidean distance. Are Euclidean distance and 1 - Pearson correlation the same, or is he wrong? If he's wrong, how can I use the correct distance metric for my heatmap?
My input
ph <- pheatmap(avgreltlog10,
               color = colorRampPalette(rev(brewer.pal(n = 7, name = "RdYlBu")))(100),
               kmeans_k = NA, breaks = NA, border_color = "grey60",
               cellwidth = 10, cellheight = 10, scale = "none", cluster_rows = TRUE,
               clustering_method = "average", cutree_rows = 4, cutree_cols = 2)
R output
$tree_row
Call:
hclust(d = d, method = method)
Cluster method : average
Distance : euclidean
Number of objects: 65
$tree_col
Call:
hclust(d = d, method = method)
Cluster method : average
Distance : euclidean
Number of objects: 10
You can check the default settings by typing the function name without parentheses at the R console:
> pheatmap
If you do that, you can see that Euclidean distance is the default:
... clustering_distance_rows = "euclidean", clustering_distance_cols = "euclidean", clustering_method = "complete", ...
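To use 1 - Pearson correlation, specify it explicitly. Since your output shows that the columns are clustered too, set it for both rows and columns:
cluster_rows = TRUE,
clustering_distance_rows = "correlation",
clustering_distance_cols = "correlation"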
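Putting that together with the call from the question, a complete version might look like the sketch below. The random matrix `mat` is only a hypothetical stand-in for avgreltlog10 (same 65 x 10 shape), and `silent = TRUE` is there just to skip drawing while inspecting the returned trees; drop it to see the plot.

```r
library(pheatmap)
library(RColorBrewer)

set.seed(1)
# Hypothetical stand-in for avgreltlog10: 65 rows x 10 columns of random values
mat <- matrix(rnorm(65 * 10), nrow = 65, ncol = 10,
              dimnames = list(paste0("row", 1:65), paste0("col", 1:10)))

ph <- pheatmap(mat,
               color = colorRampPalette(rev(brewer.pal(n = 7, name = "RdYlBu")))(100),
               border_color = "grey60", cellwidth = 10, cellheight = 10,
               scale = "none",
               cluster_rows = TRUE, cluster_cols = TRUE,
               clustering_distance_rows = "correlation",  # 1 - Pearson, rows
               clustering_distance_cols = "correlation",  # 1 - Pearson, columns
               clustering_method = "average",             # UPGMA
               cutree_rows = 4, cutree_cols = 2,
               silent = TRUE)                             # do not draw, just return

# Inspect the resulting hclust objects
ph$tree_row
ph$tree_col
```

The printed trees should still report "Cluster method : average"; the point to check is that the "Distance : euclidean" line from the question no longer appears.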
This works because, if you dig into the source code, you can see that pheatmap calls cluster_mat, which does this:
cluster_mat = function(mat, distance, method){
...
if(distance[1] == "correlation"){
d = as.dist(1 - cor(t(mat)))
}
...
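As for whether the two metrics are the same: they are not. A small base-R sketch (no pheatmap needed) makes the difference visible. The data here are made up for illustration; row r2 is constructed as a scaled and shifted copy of r1, so the two rows are perfectly correlated yet far apart in Euclidean terms.

```r
set.seed(42)
# Hypothetical data: 6 observations (rows) x 10 variables (columns)
mat <- matrix(rnorm(60), nrow = 6,
              dimnames = list(paste0("r", 1:6), paste0("c", 1:10)))

# Make r2 a scaled and shifted copy of r1: same shape, very different magnitude
mat[2, ] <- 5 * mat[1, ] + 10

d_euc <- dist(mat, method = "euclidean")
d_cor <- as.dist(1 - cor(t(mat)))   # same expression as in cluster_mat; cor() defaults to Pearson

as.matrix(d_cor)["r1", "r2"]   # essentially 0: perfectly correlated rows
as.matrix(d_euc)["r1", "r2"]   # large: the rows are far apart in value

# UPGMA (average linkage) on each distance can give different trees
hc_euc <- hclust(d_euc, method = "average")
hc_cor <- hclust(d_cor, method = "average")
```

Correlation distance groups rows by the shape of their profiles, while Euclidean distance also reacts to differences in magnitude, so the choice can change the dendrogram noticeably.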
More info is in the official documentation. There are so many packages around that it's not uncommon to mix things up :)