pandas dataframe dictionary matrix symmetric

Efficient way to create a dictionary from a symmetric matrix, with column-row pairs as keys and the corresponding matrix values as values


I want to create a dictionary of the form (row, column): value from a symmetric matrix (such as a distance matrix), as shown below, without including the NaN values or the zeros (the zeros are the diagonal). The matrix is a pandas DataFrame.

Material  100051    100120    100138    100179    100253    100265    100281                                                                         
100051       0.0  0.953488  0.959302  0.953488  0.959302  0.953488  0.953488   
100120       NaN  0.000000  0.965116  0.953488  0.959302  0.959302  0.959302   
100138       NaN       NaN  0.000000  0.959302  0.970930  0.970930  0.970930   
100179       NaN       NaN       NaN  0.000000  0.959302  0.953488  0.953488   
100253       NaN       NaN       NaN       NaN  0.000000  0.976744  0.976744   
...          ...       ...       ...       ...       ...       ...       ... 

So a dictionary that looks like:

{(100120, 100051): 0.953488, (100138, 100051): 0.959302, ...}

To create the dictionary, I can iterate over both rows and columns like this:

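import pandas as pd

jacsim_values = {}
for i in jacsim_matrix2:          # column labels
    for j in jacsim_matrix2:      # the same labels index the rows
        value = jacsim_matrix2[i][j]
        # skip the zero diagonal and the NaN entries
        if value != 0 and pd.notna(value):
            jacsim_values[i, j] = value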

But I am looking for something more efficient, because this takes quite some time for a matrix of this size. I could not find a way to do so. Can anybody help me out?


Solution

  • IIUC, you can use DataFrame.stack (keys ordered as (row, column)) or DataFrame.unstack (keys ordered as (column, row)), followed by Series.to_dict on the result:

    df.set_index('Material').rename(int, axis=1).unstack().to_dict()
    

    {(100051, 100051): 0.0,
     (100051, 100120): nan,
     (100051, 100138): nan,
     (100051, 100179): nan,
     (100051, 100253): nan,
     (100120, 100051): 0.9534879999999999,
     (100120, 100120): 0.0,
     (100120, 100138): nan,
     (100120, 100179): nan,
     (100120, 100253): nan,
     (100138, 100051): 0.9593020000000001,
     (100138, 100120): 0.965116,
     (100138, 100138): 0.0,
     (100138, 100179): nan,
     (100138, 100253): nan,
     (100179, 100051): 0.9534879999999999,
     (100179, 100120): 0.9534879999999999,
     (100179, 100138): 0.9593020000000001,
     (100179, 100179): 0.0,
     (100179, 100253): nan,
     (100253, 100051): 0.9593020000000001,
     (100253, 100120): 0.9593020000000001,
     (100253, 100138): 0.97093,
     (100253, 100179): 0.9593020000000001,
     (100253, 100253): 0.0,
     (100265, 100051): 0.9534879999999999,
     (100265, 100120): 0.9593020000000001,
     (100265, 100138): 0.97093,
     (100265, 100179): 0.9534879999999999,
     (100265, 100253): 0.9767440000000001,
     (100281, 100051): 0.9534879999999999,
     (100281, 100120): 0.9593020000000001,
     (100281, 100138): 0.97093,
     (100281, 100179): 0.9534879999999999,
     (100281, 100253): 0.9767440000000001}
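
The output above still contains the NaN entries and the zero diagonal, which the question asks to leave out. One way to drop them is to filter the unstacked Series with a boolean mask before calling to_dict. The following is only a minimal sketch: the 3x3 frame is a small stand-in built from the first values in the question, with Material already set as the index, and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for the matrix in the question: upper triangle filled,
# zero diagonal, NaN below the diagonal. Values are copied from the question.
labels = [100051, 100120, 100138]
df = pd.DataFrame(
    [[0.0, 0.953488, 0.959302],
     [np.nan, 0.0, 0.965116],
     [np.nan, np.nan, 0.0]],
    index=pd.Index(labels, name="Material"),
    columns=labels,
)

# unstack() on a single-level index returns a Series keyed by (column, row)
pairs = df.unstack()

# keep only entries that are neither NaN nor zero
pairs = pairs[pairs.notna() & (pairs != 0)]

jacsim_values = pairs.to_dict()
print(jacsim_values)
```

With this stand-in, the result should contain only the three (column, row) pairs from the upper triangle. If (row, column) keys are preferred, transposing first (df.T.unstack()) should give the swapped order.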