azure, azure-data-factory, azure-storage, azure-data-lake, azure-storage-files

Azure Data Factory Get Filepaths of nested json objects


I have the following problem in Azure Data Factory:

container/../../year=2023/m=1/d=01/h=01/m=5/file.json

There is a JSON file for every 5 minutes of every hour of every day of every month in 2023. I want to get a list of all the file paths so that I can loop through them and copy the JSON data.

How can I do that?

I tried the Get Metadata activity, but it does not work recursively.


Solution

  • The Get Metadata activity does not support recursive traversal: its childItems output lists only the direct children of a folder.

    An alternative to a direct recursive traversal is an iterative approach that uses a queue, implemented in ADF as an Array variable.

    Richard Swinbank's article "Get Metadata recursively in Azure Data Factory" discusses this workaround for the same situation, and you can follow it to build the iterative approach.

    As for getting the list of file paths only to loop through them and copy the JSON data: a data flow or a Copy activity can use wildcards in the source path to copy the files recursively, so the list is not needed for the copy itself.

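The queue-based traversal mentioned above can be illustrated outside ADF. The following is a minimal Python sketch of the idea only, not the pipeline from the article: `list_children` is a hypothetical stand-in for a Get Metadata call that returns the direct children of one folder, and the `root/...` folder tree is made-up sample data shaped like the layout in the question.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for Get Metadata's childItems output: given a folder
# path, return (name, type) pairs for its direct children only, where type
# is "Folder" or "File".
ListChildren = Callable[[str], List[Tuple[str, str]]]


def list_files_iteratively(root: str, list_children: ListChildren) -> List[str]:
    """Collect every file path under root without recursion, using a queue."""
    queue = deque([root])   # folders still to be listed (the Array variable in ADF)
    files: List[str] = []   # full paths of the files found so far

    while queue:
        folder = queue.popleft()
        for name, item_type in list_children(folder):
            path = f"{folder}/{name}"
            if item_type == "Folder":
                queue.append(path)   # list this subfolder in a later iteration
            else:
                files.append(path)
    return files


# Made-up sample tree (assumption): "root" is a placeholder, not a real path.
TREE: Dict[str, List[Tuple[str, str]]] = {
    "root": [("year=2023", "Folder")],
    "root/year=2023": [("m=1", "Folder")],
    "root/year=2023/m=1": [("d=01", "Folder")],
    "root/year=2023/m=1/d=01": [("h=01", "Folder")],
    "root/year=2023/m=1/d=01/h=01": [("m=5", "Folder"), ("m=10", "Folder")],
    "root/year=2023/m=1/d=01/h=01/m=5": [("file.json", "File")],
    "root/year=2023/m=1/d=01/h=01/m=10": [("file.json", "File")],
}

paths = list_files_iteratively("root", lambda folder: TREE.get(folder, []))
for p in paths:
    print(p)
```

Each pass takes one folder off the queue, lists only its direct children, appends subfolders back onto the queue and records files, so the loop ends once every folder has been listed. In a pipeline the same loop would presumably be built from an Until activity around Get Metadata, with Array variables holding the queue and the results.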