azure-data-factory

ADF Dataflow - How to use expression for not equal elements in the array


I have tried using the FILTER function to exclude 'CC' or 'AA' values in the array. I use ADF dataflow but it doesn't work. I don't know the correct way to exclude values in the array.

filter(record.metadata.markings,#item.code != 'CC'|| #item.code != 'AA')

Solution

  • In your expression, it won't give the expected results when the code value is AA. On the first condition check itself, it will give true value and won't check for the value AA.

    And you cannot directly modify the array which is a nested object. You can try an approach like below to achieve your requirement.

    Sample input JSON data:

    {
      "record": {
         "ns": "meta",
         "type": "SW",
           "metadata": {
              "markings":[
                  {
                      "code":"SW",
                      "name":"Rakesh"
                  },
                  {
                      "code":"AA",
                      "name":"Luffy"
                  },
                  {
                      "code":"CC",
                      "name":"Shanks"
                  },
                  {
                      "code":"SAP",
                      "name":"Raj"
                  },
                  {
                      "code":"CC",
                      "name":"John"
                  },
                  {
                      "code":"DD",
                      "name":"AR"
                  }
              ]
           }
      }
    }
    

    First take a derived column transformation using below expression. This expression filters the markings array and assign the result array values to another column.

    filter(record.metadata.markings,#item.code != 'CC'&& #item.code != 'AA')
    

    enter image description here

    Now, take another derived column transformation and create another column json_str with below expression.

    concat('{"record":',iif(hasPath('record.metadata.markings'),replace(toString(record),toString(record.metadata.markings),toString(new_markings)),toString(record)),'}')
    

    enter image description here

    This will convert the JSON object to a string and will replace the old markings array with new markings column. Now, a new JSON string column will be created which will have the required JSON object.

    enter image description here

    Now, use select transformation to select only the json_str column. In the sink, take a Delimited text dataset with below configurations. These configurations will ensure the generation of .json file with required JSON object content.

    enter image description here

    In the sink, give the .json file name and make sure it is stored in a single file name.

    enter image description here

    Run the dataflow from pipeline and the target JSON file with required JSON content will be generated as shown below.

    enter image description here