azure-sql-databaseazure-data-factory

Sink not overwriting table with specified data types


I have a sink in a data flow pointed to AzureSQL. There is a set of columns that were at one time booleans, which are now strings. I cannot for the life of me get ADF to change the data type in the output table.

I have tried:

  1. Allow Schema Drift on the Sink block->Sink menu
  2. Set Table action to "Recreate table" in the Sink block->Settings menu
  3. Checked "Auto mapping" in the Sink block->Mapping menu
  4. Un-checking "Auto mapping" and resetting the schema in the Sink block->Mapping menu
  5. Putting a Select block, then a Cast block, ahead of the sink in the data flow
  6. In desperation, I tried dropping the table in the database and forcing it to recreate. Now the pipeline executes, but the table does not appear.
  7. Given the vendor, I tried turning it off and on again.

I also get this funny message at the top of the Sink block->Mapping menu, even given the above:

At least one incoming column is mapped to a column in the sink dataset schema with a conflicting type, which can cause NULL values or runtime errors. So it obviously knows there's a difference, but everywhere else I've used this sink pattern, ADF will just go "New schema, here ya go" when I drift it in the flow.

What am I doing wrong?


Solution

  • enter image description here

    The above warning arises when the sink dataset schema is not aligned with the incoming data schema. In this case, the source data types are strings and sink table data types are booleans.

    This is my sink table schema before the dataflow run.

    enter image description here

    To change the sink table data types as per the incoming data, first you need to clear the sink table's schema in the sink dataset. If you keep the old schema here, even if you recreated the table from the dataflow, it would take this schema only. This also avoids the above warning in the dataflow mapping.

    enter image description here

    Now, go to the dataflow sink and enable the Allow schema drift and Recreate table option in the sink settings as well. Change your source data data types in the projection or use any transformations in between as per your requirement.

    enter image description here

    Now, run the dataflow, it will drop the old table and recreates the new table with the required schema as expected.

    enter image description here