azure-data-factoryazure-pipelines-build-taskdata-mapping

Is it possible to do row reference in ADF data mapping flow


We received a dataset with data broken out in tranches by category, and would like to put the tranche breaker "Department" into its own column, so the output should be "department", "employee code", "salary" and remove the extra rows at the bottom. Is there anyway to achieve this? enter image description here

Sample output

enter image description here


Solution

  • In order to create a new column called department and fill the data in the column, you can follow the below steps.

    1. source transformation is taken for input data.

    enter image description here

    1. derive transformation is taken to add a new column called "Department". The expression for this column is given as iif(isNull(toInteger(EmployeeCode)),EmployeeCode,toString(null())). Another column named dummy is added and value is given as 1.

    enter image description here

    1. Surrogate Key transformation is added to generate a surrogate key for each row in the data flow.

    enter image description here

    enter image description here

    1. window tansformation is taken to fill the missing values in the "Department" column using the last function.

    gif1

    1. filter transformation is taken to filter out the rows where the "Salary" column is null. and then select transformation is taken to select the "EmployeeCode", "Salary", and "Department" columns.

    Reference: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-script#fill-down