Is there a way to retain both the left and right join keys when performing a left join using the Python Polars library? Currently it seems that only the left join key is retained but the right join key gets dropped.
Please see example code below:
import polars as pl
df = pl.DataFrame(
{
"foo": [1, 2, 3],
"bar": [6.0, 7.0, 8.0],
"ham": ["a", "b", "c"],
}
)
other_df = pl.DataFrame(
{
"apple": ["x", None, "z"],
"ham": ["a", "b", "d"],
}
)
df.join(other_df, on="ham", how="left")
In other data manipulation libraries this is possible for example using the R dplyr
library you can do the below or even in standard SQL
we can access the right join key in the SELECT
clause.
library(dplyr)
df <- data.frame(
foo = c(1, 2, 3),
bar = c(6.0, 7.0, 8.0),
ham = c("a", "b", "c")
)
other_df <- data.frame(
apple = c("x", NA, "z"),
ham = c("a", "b", "d")
)
df %>%
left_join(other_df, by = c("ham" = "ham"), suffix = c("", "_right"), keep = TRUE)
join
now has a coalesce=
parameter to control this behaviour.
df.join(other_df, on="ham", how="left", coalesce=False)
shape: (3, 5)
┌─────┬─────┬─────┬───────┬───────────┐
│ foo ┆ bar ┆ ham ┆ apple ┆ ham_right │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ f64 ┆ str ┆ str ┆ str │
╞═════╪═════╪═════╪═══════╪═══════════╡
│ 1 ┆ 6.0 ┆ a ┆ x ┆ a │
│ 2 ┆ 7.0 ┆ b ┆ null ┆ b │
│ 3 ┆ 8.0 ┆ c ┆ null ┆ null │
└─────┴─────┴─────┴───────┴───────────┘