azure, azure-data-factory

How to update complex JSON through ADF


I have a CSV file like this:

damNo^modelVersion2^mileageUnit^oemOrderNumber^warrantyStartDate^mchCode^saCodes
DAM123456^2A^KM^OEM987654^2005-05-12T00:00:00^MCH001^1;Versjon 1~UKJENT;Interiør ukjent~UKJENT;Farge ukjent

Here, saCodes is a complex array. The JSON file I need ADF to produce looks like this:

{
    "damNo": "DAM123456",
    "modelVersion2": "2A",
    "mileageUnit": "KM",
    "oemOrderNumber": "OEM987654",
    "warrantyStartDate": "2005-05-12T00:00:00",
    "mchCode": "MCH001",
    "saCodes": [
        {"propertyId": "1", "description": "Versjon 1"},
        {"propertyId": "UKJENT", "description": "Interiør ukjent"},
        {"propertyId": "UKJENT", "description": "Farge ukjent"}
    ]
}

The end goal is to update Cosmos DB: I receive a CSV file, build the JSON from it, and then update Cosmos DB with that JSON.

Is there a way to do this through ADF?


Solution

  • You can use an ADF data flow to achieve this.

    First, take your CSV dataset and set the column delimiter with the expression @string('^').
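
    As a quick sanity check outside ADF, the same parse can be sketched in plain Python (illustrative only, not part of the pipeline): the '^' delimiter splits the header and the row into the seven columns.

```python
import csv
import io

# Sample data from the question: header plus one row, '^'-delimited.
raw = (
    "damNo^modelVersion2^mileageUnit^oemOrderNumber^warrantyStartDate^mchCode^saCodes\n"
    "DAM123456^2A^KM^OEM987654^2005-05-12T00:00:00^MCH001^"
    "1;Versjon 1~UKJENT;Interiør ukjent~UKJENT;Farge ukjent\n"
)

# DictReader with delimiter='^' mirrors the dataset's column delimiter setting.
rows = list(csv.DictReader(io.StringIO(raw), delimiter="^"))
print(rows[0]["damNo"])    # DAM123456
print(rows[0]["saCodes"])  # the raw saCodes string, still unsplit
```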

    Now, take this dataset as the source in the ADF data flow and import the projection; all of the columns will be imported.

    Now, take a derived column transformation and create a new column with the expression below.

    map(split(saCodes,'~'),@(propertyId=split(#item,';')[1],description=split(#item,';')[2]))
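
    The expression above can be mimicked in plain Python to see what it produces (a sketch, not ADF code). Note that ADF data flow arrays are 1-based, which is why the expression uses [1] and [2]; Python lists are 0-based.

```python
sa_codes = "1;Versjon 1~UKJENT;Interiør ukjent~UKJENT;Farge ukjent"

def to_sa_codes_array(value):
    """Split on '~' into items, then each item on ';' into the two fields."""
    result = []
    for item in value.split("~"):
        parts = item.split(";")
        result.append({"propertyId": parts[0], "description": parts[1]})
    return result

print(to_sa_codes_array(sa_codes))
# first element: {'propertyId': '1', 'description': 'Versjon 1'}
```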
    

    This creates the complex array as a new column with a different name. The reason for the different name: if we assign the result back to the existing saCodes column, the derived column expects the expression to return a string (the column's original type).

    Here, I used two sample rows in the CSV file; the JSON is generated for every row from the source CSV file.

    Now, take a select transformation, remove the original string saCodes column, and rename the new array column to saCodes.
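
    In plain Python terms, the select step does the equivalent of the following (illustrative only; saCodesArray is a made-up name for the derived column):

```python
# One flattened row after the derived column step: the original string
# column plus the new array column (here called saCodesArray).
row = {
    "damNo": "DAM123456",
    "saCodes": "1;Versjon 1~UKJENT;Interiør ukjent~UKJENT;Farge ukjent",
    "saCodesArray": [
        {"propertyId": "1", "description": "Versjon 1"},
        {"propertyId": "UKJENT", "description": "Interiør ukjent"},
        {"propertyId": "UKJENT", "description": "Farge ukjent"},
    ],
}

row.pop("saCodes")                        # drop the string column
row["saCodes"] = row.pop("saCodesArray")  # rename the array column
print(sorted(row))                        # ['damNo', 'saCodes']
```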

    In the sink, use a JSON dataset. To generate a single JSON file from the data flow, set the sink's file name option to Output to single file (this requires single-partition output) and provide the output file name.

    Now, run the data flow from a pipeline; it will generate the desired JSON file in the target. Since I took multiple rows in the source CSV file, multiple objects were generated in the target JSON file.
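
    For reference, the whole flow can be sketched end to end in plain Python (an illustration of the result, not the ADF implementation):

```python
import csv
import io
import json

# Same sample data as the question, '^'-delimited.
raw = (
    "damNo^modelVersion2^mileageUnit^oemOrderNumber^warrantyStartDate^mchCode^saCodes\n"
    "DAM123456^2A^KM^OEM987654^2005-05-12T00:00:00^MCH001^"
    "1;Versjon 1~UKJENT;Interiør ukjent~UKJENT;Farge ukjent\n"
)

docs = []
for row in csv.DictReader(io.StringIO(raw), delimiter="^"):
    # Split saCodes on '~', then each item on ';' into the two fields.
    row["saCodes"] = [
        {"propertyId": item.split(";")[0], "description": item.split(";")[1]}
        for item in row["saCodes"].split("~")
    ]
    docs.append(row)

# One JSON object per source row, matching the desired output shape.
print(json.dumps(docs[0], ensure_ascii=False))
```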

    You can then copy this JSON file to your Cosmos DB target (for example, with a Copy activity).