I'm using Data Fusion to replicate tables from MysQL production databases into BigQuery. It works well and the latency is low, which is nice.
On top of replicating all selected tables, Data Fusion adds a few metadata columns in each replicated table: _sequence_num
, _source_timestamp
, _row_id
and, if the option was selected during job creation, _is_deleted
.
While _is_deleted
and _sequence_num
columns are filled correctly, the _source_timestamp
column always remains empty. Did I miss something during the job configuration or is it the expected behavior?
I was not able to find anything relevant in the documentation.
Thanks in advance!
I tried multiple jobs configuration but none led to filled metadata columns.
The column _source_timestamp
is only captured for CDC streams that provide unordered events, according to the documentation. Oracle is the only source plugin that generates unordered CDC streams today. The column _source_timestamp
will not be populated for MySQL databases.Although the columns are present, they will be blank in the bigquery.
An open issue has been created for this.