[SOLVED] adf skip line count issue with csv having empty string

adf skip line count issue with csv having empty string

I am trying to copy csv files in blob storage to snowflake. first row as header and want to skip 2nd line in csv while using copy activity in azure data factory.

csv-file-rows:

ID,NAME,AGE,INCOME 
'','','',0 
123,'test',23,1045

expected: first row as header, skip second row, insert remaining rows

Error code:

DELIMITEDTEXTCOLUMNNAMENOTALLOW NULL,EXCEPTION,Message=The name of column index 1 is empty

Solution

If you know the column names before the pipeline run, you can try the below approach using copy activity.

For sample, I took the input csv data like below.

ID,NAME,AGE,INCOME
'','','',0
123,test,23,1045
124,test2,24,1046

First uncheck the First row as header in the source csv dataset.

enter image description here

Now, give this to the copy activity source and give 2 in the Skip line count option.

enter image description here

Give your target dataset as copy activity sink and import the schemas in the mapping.

enter image description here

Here, give your required column names in the order.

Now, run the copy activity and the 2nd line will be skipped in the target. For sample, I took the target as csv. In your case, it should be snowflake.

enter image description here

If you want to do this process dynamically without giving column names manually, you can use Dataflow. Give your input dataset as source in the dataflow. Make sure the First row as header of the dataset is checked in this case.

Now, add a Surrogate Key transformation and add a column key with incrementing the value as 1.

enter image description here

Then, add a filter transformation with condition to skip the first row key!=1.

enter image description here

Add your snowflake dataset as sink here and execute the dataflow from the pipeline. It will give the desired results in the target.