pandas dataframe styling

Change header names after applying styles in Pandas data frame


When you prepare a dataframe for display, the usual order is:

  1. Manipulate and select data (rows and columns)
  2. Apply styles using the .style feature
  3. Display

But table columns often have programming- and database-friendly names such as "count" or "voltage", and when showing the results you want more descriptive headers, or even headers translated into other languages. This should happen at the very end and not affect the program logic.

Unfortunately, you cannot change header names with the .rename function on a df.style object or its output, and if you go through the .data property you lose the styling. So you have to change the header / column names before applying styling, which means the styling code must address columns either by column number (not robust or meaningful) or by the final display name (not robust or feasible either). And whenever you change the display headers you have to change the program logic too, i.e., you cannot do dynamic internationalization.

Is this correctly understood, or is there a way around it? Or does this qualify for a feature request: adding a rename() function to the df.style object?

kr Hoegge


Solution

  • The Styler's "equivalent" of .rename(mapper, axis=1) is .format_index(mapper.get, axis=1) :

    colnames = {"A": "Temperature (°C)", "B": "Humidity (%)", "C": "Pressure (hPa)"}
    
    (
        weather.style
        # do your stuff here
        .background_gradient(cmap="coolwarm")
        .set_properties(**{"text-align": "center", "width": "110px"})
        # then rename the header
        .format_index(colnames.get, axis=1)
    )

    NB: If you want to rename only a few columns, use lambda x: mapper.get(x, x) as the formatter, so that unmapped names fall back to themselves. It is also worth mentioning that you can pass an exhaustive list of the new column names (here, list(mapper.values())) to relabel_index.

    Input used:

    import numpy as np
    import pandas as pd

    np.random.seed(1)
    
    weather = pd.DataFrame(
        np.c_[
            np.random.uniform(0, 35, 10),
            np.random.uniform(20, 100, 10),
            np.random.uniform(980, 1050, 10),
        ],
        index=pd.date_range(start="20240101", periods=10).date,
        columns=[*"ABC"],
    )