azure-data-factory

ADF dataflow contain file names in single file sink


I am comparing the title in movies.csv, which matches the title element in the json files, and then appending the filename to list.txt in the sink. Assume "Big Hero" and "Moana" are match in file1.json and file5.json. However, ADF returns both entire file1.json and file5.json to the sink.

I only want to see the filename "file1.json" and "file5.json" in list.txt.

title.csv:

the final result in list.txt (sink):

enter image description here


Solution

  • I only want to see the filename "file1.json" and "file5.json" in list.txt.

    After getting the required rows, you can use select transformation to get only the required filename column.

    Before the lookup, ensure the rows in your source csv file are changed to lower case using a derived column transformation.

    enter image description here

    Now, after using filter transformation, use select transformation with Rule-based mapping as shown below.

    name=='filename'
    

    enter image description here

    Now, add your text file as sink in the dataflow and run the dataflow from the pipeline.

    It will generate the target text file as expected.

    enter image description here