
PyComplexHeatmap ValueError: The condensed distance matrix must contain only finite values


I am currently working on creating heatmaps using the PyComplexHeatmap package in Python. I have a dataframe that I want to use to build heatmaps, and I encounter an error when attempting to plot the heatmap. The error message states:

Starting plotting..
Starting calculating row orders..
Reordering rows...
ValueError: The condensed distance matrix must contain only finite values.

This is my code:

import pandas as pd
from PyComplexHeatmap import ClusterMapPlotter, HeatmapAnnotation, anno_label

data = {
    'Geneid': ['K20859', 'K16698', 'K20859', 'K03781', 'K07452', 'K19147', 'K16698', 'K16698', 'K03781', 'K16698'],
    'Diagnosis': ['iRBD', 'iRBD', 'PD', 'PD', 'PD', 'PD', 'Ctrl', 'PD', 'PD', 'PD'],
    'G': ['DTU008', 'Methanosphaera', 'Methanomassiliicoccus_A', 'Methanomethylophilus', 'Methanomethylophilus', 'Methanomethylophilus', 'Methanosphaera', 'Methanobrevibacter_A', 'Methanomassiliicoccus_A', 'Methanosphaera'],
    'tpm': [0.384566, 0.614127, 1.264605, 1.361017, 1.536711, 1.727445, 2.444317, 2.745661, 3.101456, 3.288112]
}

df_G_level = pd.DataFrame(data)

pivot_tables = {}
diagnosis_values = df_G_level['Diagnosis'].unique()

for diagnosis in diagnosis_values:
    filtered_df = df_G_level[df_G_level['Diagnosis'] == diagnosis]
    pivot_table = filtered_df.pivot_table(index='Geneid', columns='G', values='tpm', aggfunc='sum', fill_value=1e-6)
    pivot_table = pivot_table.reindex(index=df_G_level['Geneid'].unique(), columns=df_G_level['G'].unique(), fill_value=1e-6)
    pivot_tables[diagnosis] = pivot_table

df_Ctrl = pivot_tables['Ctrl']

row_ha = HeatmapAnnotation(selected=anno_label(df_Ctrl.index.to_frame(), colors='black'), axis=0, verbose=0, orientation='right')

cm1 = ClusterMapPlotter(data=df_Ctrl, left_annotation=None, show_rownames=True, show_colnames=True, row_dendrogram=False, col_dendrogram=False, cmap='Purples', rasterized=True, row_split_gap=0.1, center=0.5, plot=True, label='tpm')

I have already ensured that there are no infinite values in the matrix because I substitute non-existing values with 1e-6 during the pivot table creation. However, I am still encountering the mentioned error. Could you please help me identify the problem and provide a possible solution?


Solution

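  • If you set col_cluster=False, row_cluster=False, the code runs successfully.

    The error occurs because many rows and columns contain identical values (mostly the 1e-6 filler), so the linkage cannot be calculated. With a correlation-based distance, for example, a constant row or column has zero variance, so its distance to every other row or column is NaN rather than a finite number. Replacing missing values with 1e-6 removes NaN from the input matrix, but not from the distance matrix computed from it.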
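The cause can be reproduced outside PyComplexHeatmap. The following is a minimal sketch that calls SciPy directly; it assumes that the clustering step relies on scipy.cluster.hierarchy.linkage with a correlation metric (suggested by the wording of the error message, but not confirmed by the code in the question). The small matrix is made up for illustration and mimics the Ctrl table: mostly 1e-6 filler with a single real value.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

# Illustrative matrix (not the real data): two constant filler rows
# and one row with a single real tpm value.
matrix = np.array([
    [1e-6, 1e-6, 1e-6, 1e-6],
    [1e-6, 1e-6, 1e-6, 1e-6],
    [1e-6, 2.444317, 1e-6, 1e-6],
])

# The input itself is entirely finite.
input_is_finite = bool(np.isfinite(matrix).all())

# Correlation distance normalises each centred row by its norm.
# A constant row centres to all zeros, so the result is 0/0 = NaN.
distances = pdist(matrix, metric="correlation")
has_nan = bool(np.isnan(distances).any())

# linkage refuses a condensed distance matrix that contains NaN.
try:
    linkage(distances, method="average")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print("input finite:", input_is_finite)
print("distances contain NaN:", has_nan)
print("linkage error:", error_message)
```

If clustering is still wanted, one option is to drop constant rows and columns before plotting, or to choose a metric that is defined for constant vectors; whether that suits the analysis depends on the data.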