I am trying to make a heatmap of a dissimilarity matrix that has a lot of NAs. However, I ran into problems when trying to perform clustering. Without clustering, the heatmap works fine. I do not want to impute or remove the NAs. Is there any way to perform clustering? I understand that calculating distances is a problem when NAs are present, but there should be a way around it, right?
I get the following error message:
" Error in hclust(get_dist(submat, distance), method = method) : NA/NaN/Inf in foreign function call (arg 10)
In addition: Warning message: NA exists in the matrix, calculating distance by removing NA values."
Edit:
The data I am using is an unusual matrix with a lot of NAs. Perhaps this is the problem? But I would like to visualize these NAs in the heatmap as well, so I would cluster only the rows and not the columns.
Okay, I managed to solve this problem. I had to do simple imputation. I just replaced all NAs with a "constant".
That way I can visualize the entire dataset without removing any samples or rows, and cluster both rows and columns. When I want to show where the NAs are in the dataset, I just give the "constant" a specific colour in the plot.
In this way, I treat all the NAs the same, without assigning the NAs in each row/column a value based on other samples (as mean, median, or regression methods would). This method works best for my dataset because it does not skew the data in any direction.
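For anyone who wants to try the same approach, here is a minimal sketch using only base R (`dist`, `hclust`, `heatmap`). The toy matrix, the fill value of -1, and the colour choices are assumptions made up for illustration; they are not my real data or the exact call I used. Pick a constant that lies outside the range of your real values so it can get its own colour.

```r
# Toy matrix with missing values (made-up data, for illustration only)
set.seed(1)
mat <- matrix(runif(40), nrow = 8,
              dimnames = list(paste0("row", 1:8), paste0("col", 1:5)))
mat[sample(length(mat), 15)] <- NA

# Remember where the NAs were before overwriting them
na_mask <- is.na(mat)

# Replace every NA with one constant outside the observed range
# (assumption: the real values lie in [0, 1], so -1 cannot collide with them)
fill_value <- -1
filled <- mat
filled[na_mask] <- fill_value

# With no NAs left, dist() and hclust() run without the NA/NaN/Inf error
row_clust <- hclust(dist(filled))
col_clust <- hclust(dist(t(filled)))

# The first break interval [-1.5, -0.5] contains only the fill value,
# so the first colour (grey) marks the former NAs; the other ten
# intervals cover the real values up to 1
breaks <- c(-1.5, -0.5, seq(0.1, 1, length.out = 10))
cols   <- c("grey50", heat.colors(10))

heatmap(filled,
        Rowv = as.dendrogram(row_clust),
        Colv = as.dendrogram(col_clust),
        scale = "none",   # keep raw values so the constant stays distinct
        col = cols, breaks = breaks)
```

Note that the constant does enter the distance calculation, so rows (or columns) with NAs in the same positions end up closer together in the dendrogram.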