azureazure-data-factorypartitioninghana

Partition 0CALDAY (yyyymmdd) column to year, month and day in a copy activity of Azure Data Factory


I am using SAP HANA connector to copy the data from SAP HANA to Azure data lake. In my query there is a column called 0CALDAY which is a string data type and has row values like 20230119. I would like to use it as a partition column. Simply putting partition column name in the dynamic range option works fine but what i would like to achieve is the partitions at year, month and day level in this case 2023/01/19.

I tried this

  1. @concat('/year=', formatDateTime(utcnow(), 'yyyy'), '/month=', formatDateTime(utcnow(), 'MM'), '/day=', formatDateTime(utcnow(), 'dd')) : [LIBODBCHDB DLL][HDBODBC] General error;260 invalid column name: /year=2023/month=09/day=14
  2. Replacing utcnow() with the 0CALDAY column but it is also not working rather throwing an error of invalid column name.

This link provides some basic information about partitioning but is also not helping to achieve my goal.

I will appreciate any help on this one! Thanks


Solution

  • Partition option in source settings of copy activity is used for parallel copying of data from source to sink. It does not partition the sink datastore. When huge data needs to be copied from source to sink, you can use partition option in source. This will enhance the performance of copying data. In order to partition data in sink, you can use dataflow activity instead of copy activity. But in dataflow activity, SAPHANA as a source is not available. Therefore, you can stage the full data from SAP HANA to blob storage using copy activity and then use dataflow activity to partition data in sink. In order to do this in dataflow, follow the below steps.

    enter image description here

    enter image description here