pythondataframepairwise

Pairwise row difference - Python


I would like to create a matrix showing deviation by pairwise comparison of columns. This is the head of my dataframe. My dataframe consists of 1660 rows and 8 columns (A:alternatives, C:criteria).

     C1     C2     C3
A1  1996    36     0.18
A2  1373    15     0.53
A3  1472    19     0.29

First I want to create two new columns. The first one consists of the names of my alternatives and the second one the difference between the alternatives. The result should look like this for each criterion:

         C1  C2  C3
A1-A2   623  .
A1-A3   524  .
A2-A1  -623
A2-A3  -99
A3-A1  -524
A3-A2   99

Im using Python languange and dataframes.


Solution

  • You can do a cartesian product of df index with pd.MultiIndex.from_product then filter the same value. Then index the df with the MultiIndex.

    index = pd.MultiIndex.from_product([df.index, df.index])
    index = index[index.get_level_values(0) != index.get_level_values(1)]
    out = pd.DataFrame(df.loc[index.get_level_values(0)].to_numpy() - df.loc[index.get_level_values(1)].to_numpy(),
                       index=index.map('{0[0]}-{0[1]}'.format),
                       columns=df.columns)
    
    print(out)
    
              C1    C2    C3
    A1-A2  623.0  21.0 -0.35
    A1-A3  524.0  17.0 -0.11
    A2-A1 -623.0 -21.0  0.35
    A2-A3  -99.0  -4.0  0.24
    A3-A1 -524.0 -17.0  0.11
    A3-A2   99.0   4.0 -0.24