azureazure-pipelinesazure-data-factoryimport-from-csv

Azure Data Factory: Wrong and unexpected Datatype conversion during import from csv to sql server via pipeline


I am trying to load data from a csv to a sql server database using an Azure pipeline copy data operator. During the import data is converted to other types.

in the Source preview in the pipeline I see the following

1- the value "0044" is converted to 44

2- the value 2020000000000000 is converted to 2E+16

3- the value 5.2 is converted to February 5th

4- the value 9.78 is converted to September 1978

so far i could not find a solution for 0044,

I the other cases here is what I did:

for 2 I enclosed the number 2020000000000000 in "" then it worked, though for some reason I get it enclosed in four " like so: ""2020000000000000"" for 3 and 4 I replaced the dot for a comma and then it worked.

But I would like to be able to tell the import utility to treat everything just as string and do the conversions in the database.

how can I achive this?

the code shows following for one of the columns in 3 and 4:

(

       "source":(

              "name": "Amount"

              "type": "String"

       )

       "sink":(

              "name": "Amount"

              "type": "String"

       )

)

Best Regards,


Solution

  • my sink table in the database has only nvarchar columns. I did this so after a lot of headaches with datatypes and length.

    I decided that it is easier for me to just do the conversions from nvarchar in the database into a staging table.

    What helped in the end was to do the schema import in the source Dataset where the csv is read. There is a tab "connection" and next to it another tab "schema" where you can import the schema. After doing this it worked.