Tags: python, plotly, heatmap, python-polars

How to create a heatmap from a tidy / long polars dataframe


I need to create a heatmap on the basis of a tidy/long pl.DataFrame. Consider the following example, where I used pandas and plotly to create a heatmap.

import plotly.express as px
import polars as pl

tidy_df_pl = pl.DataFrame(
    {
        "x": [10, 10, 10, 20, 20, 20, 30, 30, 30],
        "y": [3, 4, 5, 3, 4, 5, 3, 4, 5],
        "value": [5, 8, 2, 4, 10, 14, 10, 8, 9],
    }
)

print(tidy_df_pl)

shape: (9, 3)
┌─────┬─────┬───────┐
│ x   ┆ y   ┆ value │
│ --- ┆ --- ┆ ---   │
│ i64 ┆ i64 ┆ i64   │
╞═════╪═════╪═══════╡
│ 10  ┆ 3   ┆ 5     │
│ 10  ┆ 4   ┆ 8     │
│ 10  ┆ 5   ┆ 2     │
│ 20  ┆ 3   ┆ 4     │
│ 20  ┆ 4   ┆ 10    │
│ 20  ┆ 5   ┆ 14    │
│ 30  ┆ 3   ┆ 10    │
│ 30  ┆ 4   ┆ 8     │
│ 30  ┆ 5   ┆ 9     │
└─────┴─────┴───────┘

Transforming to a wide pd.DataFrame:

pivot_df_pd = (
    tidy_df_pl.pivot(index="x", on="y", values="value").to_pandas().set_index("x")
)
print(pivot_df_pd)

     3   4   5
x             
10   5   8   2
20   4  10  14
30  10   8   9

Creating the heatmap using plotly.

fig = px.imshow(pivot_df_pd)
fig.show()


This all seems a bit cumbersome. I am looking for a polars-only solution: how can I create this heatmap directly from polars, without going through a third library such as pandas?


Solution

  • Here is the heatmap without the additional "x" column that the other answer's output has. It matches your pandas output.

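    # wide Polars frame: the pivot from the question, without the pandas conversion
    pivot_df_pl = tidy_df_pl.pivot(index="x", on="y", values="value")
    fig = px.imshow(pivot_df_pl.drop("x"), y=pivot_df_pl["x"])
    fig.show()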
    

    Polars pivoted df heatmap with Plotly

    Plotly seems to treat a pandas index specially and uses it for the y-axis labels. Polars has no index, so the "x" column has to be passed as y explicitly. That is a tiny bit more to do, but it stays pure Polars.

    "If plotly really were to handle polars data natively, I would expect it to handle tidy dataframes, i.e. without needing a pivot."

    This does look possible with go.Heatmap, which takes the three tidy columns directly. It also appears to work the same way for pandas when creating a heatmap from a tidy df.

    import plotly.graph_objects as go
    
    fig2 = go.Figure(
        go.Heatmap(
            x=tidy_df_pl["y"],
            y=tidy_df_pl["x"],
            z=tidy_df_pl["value"],
        )
    )
    # switch the y-axis to align with previous output
    fig2.update_layout(yaxis_autorange="reversed")
    fig2.show()
    

    Polars tidy df heatmap with Plotly


    Altair, which also happens to back the .plot namespace of Polars dataframes, can handle the tidy format as well. Here is a very similar output using Altair:

    import altair as alt
    
    (
        tidy_df_pl.plot.rect(
            x="y:O",
            y="x:O",
            # use the "plasma" color scheme to resemble the Plotly output
            # if not wanted, just write `color="value:Q"` instead
            color=alt.Color("value:Q", scale=alt.Scale(scheme="plasma")),
        )
        .properties(width=500, height=400)
    )
    

    Polars tidy df heatmap with Altair