I have this code:
import polars as pl
df1 = pl.DataFrame({
'type': ['A', 'O', 'B', 'O'],
'origin': ['EU', 'US', 'US', 'EU'],
'qty1': [343,11,22,-5]
})
df2 = pl.DataFrame({
'type': ['A', 'O', 'B', 'S'],
'origin': ['EU', 'US', 'US', 'AS'],
'qty2': [-200,-12,-25,8]
})
df1.join(df2, on=['type', 'origin'], how='full')
which gives
┌──────┬────────┬──────┬────────────┬──────────────┬──────┐
│ type ┆ origin ┆ qty1 ┆ type_right ┆ origin_right ┆ qty2 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ i64 ┆ str ┆ str ┆ i64 │
╞══════╪════════╪══════╪════════════╪══════════════╪══════╡
│ A ┆ EU ┆ 343 ┆ A ┆ EU ┆ -200 │
│ O ┆ US ┆ 11 ┆ O ┆ US ┆ -12 │
│ B ┆ US ┆ 22 ┆ B ┆ US ┆ -25 │
│ null ┆ null ┆ null ┆ S ┆ AS ┆ 8 │
│ O ┆ EU ┆ -5 ┆ null ┆ null ┆ null │
└──────┴────────┴──────┴────────────┴──────────────┴──────┘
But the output I am after is this:
┌──────┬────────┬──────┬──────┐
│ type ┆ origin ┆ qty1 ┆ qty2 │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ i64 ┆ i64 │
╞══════╪════════╪══════╪══════╡
│ A ┆ EU ┆ 343 ┆ -200 │
│ O ┆ US ┆ 11 ┆ -12 │
│ B ┆ US ┆ 22 ┆ -25 │
│ S ┆ AS ┆ null ┆ 8 │
│ O ┆ EU ┆ -5 ┆ null │
└──────┴────────┴──────┴──────┘
I tried suffix=''
via df1.join(df2, on=['type', 'origin'], how='full', suffix='')
, but this raises an error:
DuplicateError: unable to hstack, column with name "type" already exists
How can I achieve this?
You are looking for the coalesce
parameter. Setting it to True
gives the desired result.
df1.join(df2, on=['type', 'origin'], how='full', coalesce=True)
shape: (5, 4)
┌──────┬────────┬──────┬──────┐
│ type ┆ origin ┆ qty1 ┆ qty2 │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ i64 ┆ i64 │
╞══════╪════════╪══════╪══════╡
│ A ┆ EU ┆ 343 ┆ -200 │
│ O ┆ US ┆ 11 ┆ -12 │
│ B ┆ US ┆ 22 ┆ -25 │
│ S ┆ AS ┆ null ┆ 8 │
│ O ┆ EU ┆ -5 ┆ null │
└──────┴────────┴──────┴──────┘
From the documentation of pl.DataFrame.join
.
coalesce
Coalescing behavior (merging of join columns).
None
: -> join specific.
True
: -> Always coalesce join columns.
False
: -> Never coalesce join columns.Note that joining on any other expressions than
col
will turn off coalescing.