Assuming I have two tables, one with the following columns called lease_period:
tenant_trading_name, suite_id, lease_id, building_id
and another, called lease_period_audit with the following:
audit_date, audit_type, tenant_trading_name, suite_id, lease_id, building_id
Each time a record is updated on lease_period and entry is made on lease_period_audit with a status of 'Updated'. I'm trying to find all updates made only to the tenant_trading_name field but haven't had any success. So far I have the following:
select lpa.*
from property.lease_period_audit lpa
inner join property.lease_period lp on lpa.suite_id = lp.suite_id and lpa.lease_id = lp.lease_id and lpa.building_id = lp.building_id
where audit_type = 'Updated'
and lp.tenant_trading_name <> lpa.tenant_trading_name
order by 1 desc
Where's the flaw in my thought process here? How can this be done / how should I be thinking about this?
Assuming the audit table also logs the lease_period
primary key column, referenced here lp_id
for simplicity, you could try the following approach:
Find all the rows where audit_type
is 'Updated'
.
Rank all the rows by audit_date
and partitioning them by lp_id
.
Rank the rows by audit_date
partitioning by lp_id, suite_id, lease_id, building_id
.
Get the difference between the two rankings.
Rank the rows again by audit_date
, partitioning them now by lp_id, suite_id, lease_id, building_id, (ranking_difference)
.
Output all the rows where the last ranking value is 2 or greater.
The first four steps result in a row set where each group of consecutive (in ascending order of audit_date
) rows with identical values of suite_id, lease_id, building_id
for the same lp_id
will be uniquely distinguished by a value calculated as the difference between the rankings #2 & #3.
Within the group, every row, starting from the second one, will differ from the previous one only in the value of tenant_trading_name
, which is just what we need. So, we rank the rows once again, taking into account the 'group ID' we've just obtained, then return every row with the ranking of 2 or higher.
Here's an approximate implementation:
WITH marked AS (
SELECT
*,
grp = ROW_NUMBER() OVER (PARTITION BY lp_id
ORDER BY audit_date)
- ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id
ORDER BY audit_date)
FROM lease_period_audit
WHERE audit_type = 'Updated'
),
ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id, grp
ORDER BY audit_date)
FROM marked
)
SELECT
audit_date,
lp_id,
tenant_trading_name,
suite_id,
lease_id,
building_id
FROM ranked
WHERE rnk = 2
Note. This assumes that the audit table logs only real changes, i.e. there can't be two consecutive rows with the same primary key where all four columns have identical values.