I want to implement an incremental load process using SQL Server Change Data Capture. Every example I find takes the "happy path."
In other words, they assume that the CDC history exceeds the time since the last successful incremental load.
Suppose we leave the cleanup job with the default of 3 days, and for some reason our load hasn't successfully completed for longer than that. I need to check for this and run a full extract instead.
I'm logging the successful execution datetime in SQL Server tables. So, if I compare the last successful date to the earliest record in the cdc.lsn_time_mapping
table, will this accomplish my task?
Basically something like:
Select @LastSuccessfulDate from audit....
Select @MinCdCDate = min(tran_begin_time) from cdc.lsn_time_mapping
if @MinCdCDate > @LastSuccessfulDate then 'Full' else 'Incremental'
Should this work? Is there a better way to accomplish it?
I would always stay in the "log domain" not the "time domain" when working directly with CDC. So track the last LSN of the last run and compare it against sys.fn_cdc_get_min_lsn every time you syncronize.
So if you last synchronized at lsn=100, and the min_lsn=110, then you've got a gap of 10 missing log records.
But this is only one of many scenarios that will require you to reinitialize the replication with a full sync, so you should also have an input parameter or somesuch to force a full sync.