python pandas numpy pandas-styles

How to highlight non-main diagonal elements


I was wondering how to highlight diagonal elements of a pandas DataFrame using df.style methods.

I already found out how to do it for the main diagonal, but I can't manage to highlight the diagonal that starts from the second column. Here is what I have so far:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':[1,2,3,4],'b':[1,3,5,7],'c':[1,4,7,10],'d':[1,5,9,11]})

def style_diag(data):
    # Build the empty style frame from the argument rather than the global df
    diag_mask = pd.DataFrame("", index=data.index, columns=data.columns)
    min_axis = min(diag_mask.shape)
    diag_mask.iloc[range(min_axis), range(min_axis)] = 'background-color: yellow'
    return diag_mask

df.style.apply(style_diag, axis=None)

This gives the following output:

table with highlighting along the main diagonal

(although I don't really understand the magic in this function)

And I'd like to have a yellow highlight across the diagonal elements 1, 4, 9.

How can I do that?


Solution

  • There are several options here, depending on the exact needs. One approach is to create a boolean mask of the same shape as your DataFrame, with the diagonal at the desired offset set to True, and use it to apply styles conditionally.

    The approach and usage

    def style_diag(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
        # Create empty styles DataFrame
        style_df = pd.DataFrame('', index=df_.index, columns=df_.columns)
    
        # Create a 2D False mask
        mask = np.zeros(df_.shape, dtype=bool)
    
        # Find diagonal indices at an offset and replace values with True
        rows, cols = np.indices(mask.shape)
        mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
    
        # Set diagonal styles using mask
        style_df[mask] = 'background-color:yellow'
        return style_df
    

    This can be used like:

    df.style.apply(style_diag, offset=1, axis=None)
    

    Which produces the following results:

    Styled DataFrame with offset 1 diagonal highlighted yellow

    Similarly, this can be used without an offset to reproduce the original output:

    df.style.apply(style_diag, axis=None)
    

    Styled DataFrame with main diagonal highlighted yellow

    Or even with negative offsets:

    df.style.apply(style_diag, offset=-2, axis=None)
    

    Styled DataFrame with offset -2 diagonal highlighted yellow
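
    To check which cells a given offset selects without rendering the table, one option is to compare the returned style frame against the empty string and pull the matching values out of df. The following is a minimal sketch that repeats style_diag from above so that it runs on its own; highlighted_values is a hypothetical helper name chosen for illustration, not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, 3, 4],
    'b': [1, 3, 5, 7],
    'c': [1, 4, 7, 10],
    'd': [1, 5, 9, 11]
})


def style_diag(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
    # Same function as in the answer above
    style_df = pd.DataFrame('', index=df_.index, columns=df_.columns)
    mask = np.zeros(df_.shape, dtype=bool)
    rows, cols = np.indices(mask.shape)
    mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
    style_df[mask] = 'background-color:yellow'
    return style_df


def highlighted_values(df_: pd.DataFrame, offset: int = 0) -> list:
    # Hypothetical helper: values of the cells that style_diag would colour
    styled = style_diag(df_, offset=offset).ne('').to_numpy()
    return df_.to_numpy()[styled].tolist()


print(highlighted_values(df, offset=1))   # [1, 4, 9]
print(highlighted_values(df, offset=0))   # [1, 3, 7, 11]
print(highlighted_values(df, offset=-2))  # [3, 7]
```

    With offset=1 this returns the elements 1, 4, 9 asked about in the question.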

    How it works

    We start with an all-False mask of the same shape as our DataFrame:

    mask = np.zeros(df_.shape, dtype=bool)
    
    # array([[False, False, False, False],
    #        [False, False, False, False],
    #        [False, False, False, False],
    #        [False, False, False, False]])
    

    From here we need to find the diagonal indices in order to replace the values on the diagonal with True. There is a function np.diag_indices_from, but unfortunately it does not directly support offset diagonals.

    Let's instead grab the indices for this mask using np.indices:

    rows, cols = np.indices(mask.shape)
    
    # rows
    # array([[0, 0, 0, 0],
    #        [1, 1, 1, 1],
    #        [2, 2, 2, 2],
    #        [3, 3, 3, 3]])
    # cols
    # array([[0, 1, 2, 3],
    #        [0, 1, 2, 3],
    #        [0, 1, 2, 3],
    #        [0, 1, 2, 3]])
    

    We can now use the np.diag function on both rows and cols; unlike np.diag_indices_from, it natively supports offsets through its k parameter. (For this example, offset is 1.)

    np.diag(rows, k=offset)
    # array([0, 1, 2])
    
    np.diag(cols, k=offset)
    # array([1, 2, 3])
    

    We can use the results from np.diag as indexers to update our mask:

    mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
    
    # array([[False,  True, False, False],
    #        [False, False,  True, False],
    #        [False, False, False,  True],
    #        [False, False, False, False]])
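
    As a cross-check, np.eye accepts the same k offset and can build an equivalent boolean mask in a single call. This is an alternative to the np.indices / np.diag route described here, not the method this answer uses; diag_mask is a name made up for this sketch.

```python
import numpy as np


def diag_mask(shape: tuple, offset: int = 0) -> np.ndarray:
    # Illustrative helper (hypothetical name): True on the k-th diagonal, False elsewhere
    n_rows, n_cols = shape
    return np.eye(n_rows, n_cols, k=offset, dtype=bool)


# Same construction as in the answer, for comparison
mask = np.zeros((4, 4), dtype=bool)
rows, cols = np.indices(mask.shape)
mask[np.diag(rows, k=1), np.diag(cols, k=1)] = True

print(np.array_equal(diag_mask((4, 4), offset=1), mask))  # True
```

    Either mask can be passed to style_df[mask] in the same way.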
    

    Now we have a well-formed mask that can easily be used to apply style strings:

    style_df[mask] = 'background-color:yellow'
    
    #   a                        b                        c                        d
    # 0    background-color:yellow                                                  
    # 1                             background-color:yellow                         
    # 2                                                      background-color:yellow
    # 3                                                                             
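
    Because np.diag also accepts non-square 2-D arrays, the same steps should carry over to DataFrames with more columns than rows (or the reverse). Below is a small sketch using a made-up 3x5 frame; wide and its values are assumptions for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x5 frame, used only for illustration
wide = pd.DataFrame(np.arange(15).reshape(3, 5), columns=list('abcde'))

# Same mask construction as above, with offset 2
mask = np.zeros(wide.shape, dtype=bool)
rows, cols = np.indices(mask.shape)
mask[np.diag(rows, k=2), np.diag(cols, k=2)] = True

style_df = pd.DataFrame('', index=wide.index, columns=wide.columns)
style_df[mask] = 'background-color:yellow'

# Values sitting under the styled cells: positions (0, 2), (1, 3), (2, 4)
styled_values = wide.to_numpy()[style_df.ne('').to_numpy()].tolist()
print(styled_values)  # [2, 8, 14]
```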
    

    Complete working example with imports and version numbers used

    import numpy as np  # v1.26.2
    import pandas as pd  # v2.1.4
    
    df = pd.DataFrame({
        'a': [1, 2, 3, 4],
        'b': [1, 3, 5, 7],
        'c': [1, 4, 7, 10],
        'd': [1, 5, 9, 11]
    })
    
    
    def style_diag(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
        style_df = pd.DataFrame('', index=df_.index, columns=df_.columns)
        mask = np.zeros(df_.shape, dtype=bool)
        rows, cols = np.indices(mask.shape)
        mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
        style_df[mask] = 'background-color:yellow'
        return style_df
    
    
    df.style.apply(style_diag, offset=1, axis=None)
    
    

    Styled DataFrame with offset 1 diagonal highlighted yellow