In Azure Data Factory, I have a dynamic pipeline that extracts data from our SAP systems. The metadata for each given table (which are around 2000) is extracted via a lookup from which I get the name, ODP context, key columns and connection related details, like so:
Afterwards, I have a data flow that extracts the data and sinks it to a .parquet file:
I can figure out how to assign most of the parameters dynamically, but I struggle to figure out how to make a selection only of the columns given in my metadata (KeyColumns) and not the entire object. I am running only full loads.
I have tried to run it as Full load then incremental, where I parsed the key columns to the Key Columns parameter, but not surprising, the whole object was loaded since this parameter indicates columns used for de-duplication.
To make a selection only of the columns given in the metadata and not the entire object, you need to use select transformation. Giving the expression in Key columns
, does not help you in selecting the desired columns dynamically. Key columns
are used for incremental data loading. Below is the way to do using select transformation
.
Click +Add mapping and select Rule based mapping
Give the expression in source column of mapping as in($<array_parameter_name>,name)
.
Here replace <array_parameter_name> with parameter name which contains the array of column names. name
is the keyword which is used here to check if any of the column names matches with value in the array.