I have the following two tables
I am using the following query in Snowflake to join the tables:
SELECT
Packed_Cases.Box_no,
Packed_Cases.Item_Style,
Packed_Cases.Qty.,
Outbound_Cases.Box_no,
Outbound_Cases.Store,
MAX_BY(Outbound_Cases.Date_Modified,Outbound_Cases.Date_Modified) AS Date_Modified,
Outbound_Cases.SYS_State
FROM
Packed_Cases
LEFT JOIN
Outbound_Cases ON Packed_Cases.Box_no = Outbound_Cases.Box_no
WHERE
Outbound_Cases.Store LIKE '%15%' AND
Outbound_Cases.SYS_State = '50'
GROUP BY
Packed_Cases.Box_no, Packed_Cases.Item_Style, Packed_Cases.Qty.,
Outbound_Cases.Box_no, Outbound_Cases.Store, Outbound_Cases.SYS_State
After running the query I am getting the following output which has duplicate lines highlighted in yellow. I am using the MAX_BY
function on Outbound_Cases.Date_Modified
to filter the similar rows and pick the row with latest date with matching Box_no
to Packed_cases
table.
I am expecting the following output where the Box_no
from the Packed_cases
table will pull the store number, date modified and Sys_state
from Outbound_Cases
table with latest date modified.
I have first picked the latest row for box_no
based on latest date_modified
date, you can test and let me know if it works, I was unable to run the query as the sample input data is not in text format in the question.
Here is the sample query
with latest_outbound_cases as (
select
box_no,
store,
max(date_modified) as latest_date_modified,
sys_state
from outbound_cases
where store like '%15%'
and sys_state = '50'
group by box_no, store, sys_state
)
select
packed_cases.box_no,
packed_cases.item_style,
packed_cases.qty,
latest_outbound_cases.store,
latest_outbound_cases.latest_date_modified as date_modified,
latest_outbound_cases.sys_state
from packed_cases
left join latest_outbound_cases
on packed_cases.box_no = latest_outbound_cases.box_no;