azure-data-factorycopy-data

Access the source data in pre-copy script in copy data activity


I am trying to access the some data stored in container in blob storage in copy data activity. In the source settings I am able to specify the dataset for the source file , now I want to access the same file in the sink settings in the pre-copy script to make some transformations on it.

https://i.sstatic.net/JpKmgMU2.png

https://i.sstatic.net/V0FxXaEt.png

In the next image , I want to fill in the space in quotes, what is the value to access the source in copy data.


Solution

  • Copy activity pre-copy script is only for the sink dataset database. Whatever the queries that you run in this, will be like a script in the target database. It means, the pre-copy script won't have access to the source data of the copy activity. The pre-copy script is for truncation or updating of certain tables in the target database before the copy to the target table.

    As your requirement is to modify the source data before the copy to target table, you can use Dataflow to do the transformations. In the source of the dataflow, give your source dataset and you can use Select, aggregate, derived column transformations as per your SQL query. In the sink, add your target SQL table.

    enter image description here

    (OR)

    If your source data is less than 5000 rows, you can try below approach using the combination of lookup and copy activity.

    Use lookup activity with source dataset and it will give the data as JSON array.

    enter image description here

    After lookup use copy activity but for both source and sink datasets, give the same target table dataset.

    In the source of the copy activity, use query option. Use openjson() with lookup activity output JSON array like below sample in the query.

    declare @json nvarchar(max) = N'@{activity('Lookup1').output.value}';
    
    SELECT * FROM  
     OPENJSON (@json)  
    WITH (   
                  id int '$.id' ,  
                  date varchar(32) '$.date',  
                  name int '$.name'
     );
    

    Now, you can modify the source data using your SQL query like above which will be copied to the target table.