Making column values bold based on other columns in python


I am trying to figure out how to make column values in my pandas DataFrame bold based on the values in another column. Specifically, I have regression outputs and want to bold the coefficients that correspond to a low p-value. I am using Python.

I have the following dataframe containing regression output results:

Model_run    Param_1    Param_2    Param_3   Param_1_pv   Param_2_pv   Param_3_pv
-----------------------------------------------------------------------------------
1               .453       .756       .945          .06          .04          .03
2               .452       .336       .342          .04          .03          .22
3               .264       .443       .044          .33          .05          .32
4               .356       .543       .033          .01          .05          .14
5               .234       .435       .032          .04          .03          .09
... 

where the "Param" columns hold the coefficient for each parameter (independent variable) and "pv" means the p-value for that parameter. I want to make each coefficient whose p-value is <= 0.05 bold. I would show the output I want, but I don't think the code formatting here supports bold text.

How can I accomplish this in python? I have seen that I can use .style.applymap(), but I am confused about how to format the coefficients based on their corresponding p-value columns, since most tutorials just show how to format the columns themselves.


Solution

  • You can use pandas.DataFrame.style.apply with a custom function.

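    import numpy as np

    def bold(v, props="", threshold=0.05):
        # v is one column of df; its p-values live in the "<name>_pv" column.
        if v.name + "_pv" not in df:
            return np.full(v.shape, "")
        return np.where(df[v.name + "_pv"].le(threshold), props, "")

    # Extra keyword arguments (props, threshold) are forwarded to bold.
    df.style.apply(bold, props="font-weight: bold", axis=0, threshold=0.05)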
    

    Workaround (the same function with the CSS string hard-coded instead of passed as props):

    def bold(v, threshold=0.05):
        if v.name + "_pv" not in df:
            return np.full(v.shape, "")
        return np.where(df[v.name + "_pv"].le(threshold), "font-weight: bold", "") 
    
    df.style.apply(bold, axis=0, threshold=0.05)
    

    Insight:

    The function looks for a column with the same name plus the _pv suffix and compares that column against threshold.

    If no such _pv column exists, it returns an array of empty strings, which says that no cell in that column needs styling (in this case, font-weight: bold).

    If the _pv column exists, it uses numpy.where to build an array holding props where the p-value is at or below threshold and "" elsewhere, so only those cells are styled.
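
To see what the styling function hands back to pandas, here is a minimal sketch that rebuilds the first five rows from the question and computes the grid of CSS strings directly. The names pvalue_css and style_coefficients are hypothetical helpers, not part of pandas. This variant takes the whole frame as an argument (the axis=None form of Styler.apply) rather than reading a global df, which is a design choice of this sketch and not something the answer above requires.

```python
import numpy as np
import pandas as pd

# First five rows of the regression output shown in the question.
df = pd.DataFrame({
    "Model_run": [1, 2, 3, 4, 5],
    "Param_1": [0.453, 0.452, 0.264, 0.356, 0.234],
    "Param_2": [0.756, 0.336, 0.443, 0.543, 0.435],
    "Param_3": [0.945, 0.342, 0.044, 0.033, 0.032],
    "Param_1_pv": [0.06, 0.04, 0.33, 0.01, 0.04],
    "Param_2_pv": [0.04, 0.03, 0.05, 0.05, 0.03],
    "Param_3_pv": [0.03, 0.22, 0.32, 0.14, 0.09],
})


def pvalue_css(data, threshold=0.05, props="font-weight: bold"):
    """Hypothetical helper: CSS strings with the same shape as `data`."""
    # Start with "no styling" everywhere.
    css = pd.DataFrame("", index=data.index, columns=data.columns)
    for col in data.columns:
        pv_col = f"{col}_pv"
        # Only columns that have a matching "<name>_pv" column get styled.
        if pv_col in data.columns:
            css[col] = np.where(data[pv_col].le(threshold), props, "")
    return css


def style_coefficients(data, threshold=0.05):
    """Hypothetical helper: wrap the frame in a Styler using pvalue_css."""
    # axis=None passes the whole frame to pvalue_css in a single call.
    return data.style.apply(pvalue_css, axis=None, threshold=threshold)


css = pvalue_css(df)
print(css[["Param_1", "Param_2", "Param_3"]])
```

In a notebook, style_coefficients(df) should render the frame with the qualifying coefficients in bold. The p-value columns and Model_run are left unstyled because no column named after them with an extra _pv suffix exists.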