pythonformattingmatchingrowwise

Compare two pandas df rowwise and apply conditional formating to matching values


I think its a small problem, but I did not succeed with a code solution. I have two dataframes df_diff and df_all. Both pandas df have the same key column (e.g. "Key") but different column names.

The code should iterate over rows of df_diff, take the key value, look up the row with the key value in df_all, and then iterate over all cells of this row of df_diff and search if any of the cells matches one cell value of the corresponding row in df_all.

If there is a match, this cell should receive red background color.

Note that the column names a different between these data frames, except the Key column.

Here is an example input: df_diff

Key Column_1 Column 2
Key2 Value2 Value3
Key3 Value3 Value4
Key4 Value5 Value6

df_all

Key Column_all_A Column_all_B
Key2 Value8 Value2
Key3 Value3 Value10
Key6 Value0 Value11

The Expected Output: Expected Output of df_all with conditioned formatting


Solution

  • Here is one of the options that uses a mapper with a listcomp to build the styles :

    lstyles = [
        ["background-color:lightcoral" # <-- adjust the color here
        if v in df_diff.set_index("Key").T.to_dict("list").get(k, []) else ""
        for v in vals] for k, *vals in df_all.values
    ]
    
    use_cols = df_all.columns.difference(["Key"])
    
    out = (
        df_all.style.apply(lambda _: pd.DataFrame(lstyles, columns=use_cols),
                           axis=None, subset=use_cols)
    )
    

    Output :

    enter image description here