pythondataframeleft-joinpython-polars

Is there a way to retain keys in Polars left join


Is there a way to retain both the left and right join keys when performing a left join using the Python Polars library? Currently it seems that only the left join key is retained but the right join key gets dropped.

Please see example code below:

import polars as pl

df = pl.DataFrame(
    {
        "foo": [1, 2, 3],
        "bar": [6.0, 7.0, 8.0],
        "ham": ["a", "b", "c"],
    }
)

other_df = pl.DataFrame(
    {
        "apple": ["x", None, "z"],
        "ham": ["a", "b", "d"],
    }
)

df.join(other_df, on="ham", how="left")

In other data manipulation libraries this is possible for example using the R dplyr library you can do the below or even in standard SQL we can access the right join key in the SELECT clause.

library(dplyr)

df <- data.frame(
  foo = c(1, 2, 3),
  bar = c(6.0, 7.0, 8.0),
  ham = c("a", "b", "c")
)

other_df <- data.frame(
  apple = c("x", NA, "z"),
  ham = c("a", "b", "d")
)

df %>%
  left_join(other_df, by = c("ham" = "ham"), suffix = c("", "_right"), keep = TRUE)

Solution

  • join now has a coalesce= parameter to control this behaviour.

    df.join(other_df, on="ham", how="left", coalesce=False)
    
    shape: (3, 5)
    ┌─────┬─────┬─────┬───────┬───────────┐
    │ foo ┆ bar ┆ ham ┆ apple ┆ ham_right │
    │ --- ┆ --- ┆ --- ┆ ---   ┆ ---       │
    │ i64 ┆ f64 ┆ str ┆ str   ┆ str       │
    ╞═════╪═════╪═════╪═══════╪═══════════╡
    │ 1   ┆ 6.0 ┆ a   ┆ x     ┆ a         │
    │ 2   ┆ 7.0 ┆ b   ┆ null  ┆ b         │
    │ 3   ┆ 8.0 ┆ c   ┆ null  ┆ null      │
    └─────┴─────┴─────┴───────┴───────────┘