pythonjoinpython-polars

How do I keep both columns as the result of a Polars join with left_on and right_on?


How do I keep both columns as the result of a Polars join with left_on and right_on, like in SQL?

df1 = pl.DataFrame({"a": ["x","y","z"], "lk": [1,2,3]})
df2 = pl.DataFrame({"b": ["x","y","z"], "rk": [1,2,3]})

SQL analogy:

SELECT *
FROM tbl1
INNER JOIN tbl2
  ON tbl1.lk = tbl2.rk

The usual result (currently undocumented) only keeps the left column. If I do a .select("rk") afterwards, that column is already gone.

>>> print(df1.join(df2, how="inner", left_on="lk", right_on="rk"))
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ lk  ┆ b   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ str │
╞═════╪═════╪═════╡
│ x   ┆ 1   ┆ x   │
│ y   ┆ 2   ┆ y   │
│ z   ┆ 3   ┆ z   │
└─────┴─────┴─────┘

Solution

  • Use coalesce=False.

    >>> print(df1.join(df2, how="inner", left_on="lk", right_on="rk", coalesce=False))
    shape: (3, 4)
    ┌─────┬─────┬─────┬─────┐
    │ a   ┆ lk  ┆ b   ┆ rk  │
    │ --- ┆ --- ┆ --- ┆ --- │
    │ str ┆ i64 ┆ str ┆ i64 │
    ╞═════╪═════╪═════╪═════╡
    │ x   ┆ 1   ┆ x   ┆ 1   │
    │ y   ┆ 2   ┆ y   ┆ 2   │
    │ z   ┆ 3   ┆ z   ┆ 3   │
    └─────┴─────┴─────┴─────┘
    

    Incidentally, SQL has (poorly named) COALESCE which does something completely different.