I have the following effective dated table with several thousands records that would need to be updated/inserted based on certain criteria.
Table Car_Specification
:
Effective_Date | Car_ID | Num_Wheels | Num_Doors | Engine_Type | Color |
---|---|---|---|---|---|
01-MAR-2021 | C001 | 4 | 2 | Turbo | Red |
01-APR-2021 | C001 | 4 | 2 | Turbo | Blue |
01-JUN-2021 | C001 | 4 | 2 | Turbo | Green |
Lookup Car_Change
:
Effective_Date | Car_ID | Num_Wheels | Num_Doors | Engine_Type | Color |
---|---|---|---|---|---|
01-MAR-2021 | C001 | 4 | 2 | Turbo | White |
01-MAY-2021 | C001 | 4 | 2 | Manual | Red |
Every day, changes to a car is captured inside the Car_Change
table and the updates will need to be updated into Car_Specification
table based on Car_ID
and Effective_Date
.
In the example above, I would like to query Car_Specification
for the list of effective_date
and car_id
and perform a lookup into Car_Change
table, if a match based on effective_date
and car_id
is found, I will update the data in the Car_Change
with the data in the Car_Specification
table for those fields that matches. The changes will also need to be forward filled or propagated forward.
If the effective_date
and car_id
doesn't match but there is a change in Car_Change
, I will need to insert the new effective_date and car_id
into Car_Specification
and forward fill the changes.
For instance, C001 will show the following after the update:
Effective_Date | Car_ID | Num_Wheels | Num_Doors | Engine_Type | Color |
---|---|---|---|---|---|
01-MAR-2021 | C001 | 4 | 2 | Turbo | White |
01-APR-2021 | C001 | 4 | 2 | Turbo | White |
01-MAY-2021 | C001 | 4 | 2 | Manual | Red |
01-JUN-2021 | C001 | 4 | 2 | Manual | Red |
I am thinking along the line of using the MERGE UPDATE/INSERT construct to handle the changes. Would like to seek advice on the most efficient way to handle such changes given the tables are large and I am looking to do the update/insert operation on a daily basis to keep my data inside Car_Specification
updated. If I am going ahead with MERGE, how do I handle the forward filling of data?
As a frame challenge. Several thousands of records is a tiny data set; you do not need to merge the data. Just have a single row for each car in Car_Specification
for the original specification and then if there are changes put them into Car_Changes
and when you want to display the latest value then select the last change, or the specification if there were no changes.
SELECT COALESCE(c.effective_date, s.effective_date) AS effective_date,
s.car_id,
COALESCE(c.num_wheels, s.num_wheels) AS num_wheels,
COALESCE(c.num_doors, s.num_doors) AS num_doors,
COALESCE(c.engine_type, s.engine_type) AS engine_type,
COALESCE(c.color, s.color) AS color
FROM car_specification s
LEFT OUTER JOIN LATERAL(
SELECT *
FROM car_changes c
WHERE c.car_id = s.car_id
AND c.effective_date >= s.effective_date
ORDER BY c.effective_date DESC
FETCH FIRST ROW ONLY
)
ON (1 = 1) -- The join condition is inside the lateral join.
If you want you can wrap it in a view.