dynamicazure-data-factoryjsonschemajson-flattenerunroll

Can the Flatten activity in ADF data flow handle dynamic unroll by


I'm importing items from Cosmos db, however each item can potentially have a different schema structure. Below are a few examples of how they could look.

Ex.1

{
  "id": "01-000008",
  "pKey": "000008",
  "delete": false,
  "/": {
    "fwqs": false,
    "itemDescription": "Test Description one",
    "linkset": {
      "anchor": "Test anchor one",
      "itemDescription": "Test Description one",
      "https://test/testIdeas": [
        {
          "href": "https://test.com/test/",
          "title": "Where to test one",
          "fwqs": true,
          "public": true,
          "type": "text/html",
          "context": [
            "us"
          ],
          "hreflang": [
            "en"
          ]
        }
      ],
      "https://test/activityIdeas": [
        {
          "href": "https://test.com/test/",
          "title": "Where to buy",
          "fwqs": true,
          "public": true,
          "type": "text/html",
          "context": [
            "us"
          ],
          "hreflang": [
            "en"
          ]
        }
      ]
    }
  }
}

Ex.2

{
  "id": "01-000009",
  "pKey": "000009",
  "delete": false,
  "/10/101": {
    "fwqs": false,
    "itemDescription": "Test Description Two",
    "linkset": {
      "anchor": "Test anchor Two",
      "itemDescription": "Test Description Two",
      "https://test/testIdeas": [
        {
          "href": "https://test.com/test/",
          "title": "Where to test",
          "fwqs": true,
          "public": true,
          "type": "text/html",
          "context": [
            "us"
          ],
          "hreflang": [
            "en"
          ]
        }
      ]
    }
  }
}

Ex.3

{
  "id": "01-0000010",
  "pKey": "0000010",
  "delete": false,
  "/10/12345/AB/100001": {
    "fwqs": false,
    "itemDescription": "Test Description Three",
    "linkset": {
      "anchor": "Test anchor Three",
      "itemDescription": "Test Description Three",
      "https://test/testIdeas": [
        {
          "href": "https://test.com/test/",
          "title": "Where to test",
          "fwqs": true,
          "public": true,
          "type": "text/html",
          "context": [
            "us"
          ],
          "hreflang": [
            "en"
          ]
        }
      ]
    }
  }
}

I've made sure to clear the data set schema and the source projection in preparation to handle the dynamic key.

In Ex.1 in the unroll by I'm trying to access {/}.linkset.{https://test/testIdeas}. But because the "{/}" key is always changing I need to make this part dynamic somehow in order to also pick up the keys you find in Ex.2 & Ex.3.

I'm struggling to resolve the issue, I'm not sure what function should be used to identify a dynamic key and if it should be done in the "add dynamic content" or the "add column pattern" part of the "unroll by" option in the flatten activity.

desired results current results

flatten activity ideas

flatten test ideas

derived 'new_col' activity


Solution

  • One approach to solve this is renaming the dynamic key with a static name and then flattening that column. Since the key has the pattern starting with \, column which begins with \ can be renamed as new_column. Below are the detailed steps to rename the dynamic key.

    enter image description here

    Dataflow script for derived column:

    derive(each(match(startsWith(name,'/')), 'new_col'  =  $$)) ~> derivedColumn1
    

    enter image description here

    By this way, flatten the Json with dynamic key can be flattened.