azure-data-factory

update field value in dataflow ADF


How can I update the field "Family.age" from 27 to 40 in the json file.? I tried to derive the column, but it didn't update the value. I'm a new to ADF and not sure the config is corrected.

 {
"Family": {
    "first_name": "John",
    "last_name": "Smith",
    "is_alive": true,
    "age": 27,
    "address": {
        "street_address": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postal_code": "10021-3100"
    },
    "phone_numbers": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "office",
            "number": "646 555-4567"
        }
    ],
    "children": [
        "Catherine",
        "Thomas",
        "Trevor"
    ]
 }
}   

This expression ensures that all elements in the Family object are retained, and only the age field is updated to 40 if it was 27. However, the result age value remain the same.

Column: age, and Expression: iif((Family.age == 27, 40, Family.age)

enter image description here


Solution

  • I have tried to update the inner JSON value using select and derived columns. First, I have extracted the inner age column and updated to required value. Then, I have used the derived column transformation to generate the required JSON. But I am unable to do it as it requires to mention all column names manually and also needed type casting.

    So, I have tried the below approach to achieve your requirement. This approach only works when you need the output in a JSON file.

    First, use a derived column transformation and create a string column str with the below expression. You can also use replace on this JSON string to replace your Family.last_name value with concat(Family.last_name,'.json').

    concat('{"Family":',replace(toString(Family),'"age":27','"age":40'),'}')
    

    enter image description here

    The above expression converts the Family object into a string and replaces the required key and value pair with the updated value and concatenates with the outer key name Family and generates the requires JSON as string as shown below.

    enter image description here

    Now, use a select transformation to remove the extra column Family. In the sink, use a csv dataset to generate the required JSON. In the csv dataset, give the below configurations.

    enter image description here

    In the dataflow sink, set the output to a single file and give the <filename>.json like below.

    enter image description here

    Run the dataflow from the pipeline and the required JSON file will be generated in the sink location.

    enter image description here

    When you want to use this JSON file, you need to create a JSON dataset for this as per your requirement.