sqldatabasesql-server-2008t-sqlaudit-tables

Query help when using audit table


Assuming I have two tables, one with the following columns called lease_period:

tenant_trading_name, suite_id, lease_id, building_id

and another, called lease_period_audit with the following:

audit_date, audit_type, tenant_trading_name, suite_id, lease_id, building_id

Each time a record is updated on lease_period and entry is made on lease_period_audit with a status of 'Updated'. I'm trying to find all updates made only to the tenant_trading_name field but haven't had any success. So far I have the following:

select              lpa.*
from                property.lease_period_audit lpa
inner join          property.lease_period lp on lpa.suite_id = lp.suite_id and lpa.lease_id = lp.lease_id and lpa.building_id = lp.building_id
where               audit_type = 'Updated'
                    and lp.tenant_trading_name <> lpa.tenant_trading_name
order by            1 desc  

Where's the flaw in my thought process here? How can this be done / how should I be thinking about this?


Solution

  • Assuming the audit table also logs the lease_period primary key column, referenced here lp_id for simplicity, you could try the following approach:

    1. Find all the rows where audit_type is 'Updated'.

    2. Rank all the rows by audit_date and partitioning them by lp_id.

    3. Rank the rows by audit_date partitioning by lp_id, suite_id, lease_id, building_id.

    4. Get the difference between the two rankings.

    5. Rank the rows again by audit_date, partitioning them now by lp_id, suite_id, lease_id, building_id, (ranking_difference).

    6. Output all the rows where the last ranking value is 2 or greater.

    The first four steps result in a row set where each group of consecutive (in ascending order of audit_date) rows with identical values of suite_id, lease_id, building_id for the same lp_id will be uniquely distinguished by a value calculated as the difference between the rankings #2 & #3.

    Within the group, every row, starting from the second one, will differ from the previous one only in the value of tenant_trading_name, which is just what we need. So, we rank the rows once again, taking into account the 'group ID' we've just obtained, then return every row with the ranking of 2 or higher.

    Here's an approximate implementation:

    WITH marked AS (
      SELECT
        *,
        grp = ROW_NUMBER() OVER (PARTITION BY lp_id
                                     ORDER BY audit_date)
            - ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id
                                     ORDER BY audit_date)
      FROM lease_period_audit
      WHERE audit_type = 'Updated'
    ),
    ranked AS (
      SELECT
        *,
        rnk = ROW_NUMBER() OVER (PARTITION BY lp_id, suite_id, lease_id, building_id, grp
                                     ORDER BY audit_date)
      FROM marked
    )
    SELECT
      audit_date,
      lp_id,
      tenant_trading_name,
      suite_id,
      lease_id,
      building_id
    FROM ranked
    WHERE rnk = 2
    

    Note. This assumes that the audit table logs only real changes, i.e. there can't be two consecutive rows with the same primary key where all four columns have identical values.