I am comparing the title in movies.csv, which matches the title element in the json files, and then appending the filename to list.txt in the sink. Assume "Big Hero" and "Moana" are match in file1.json and file5.json. However, ADF returns both entire file1.json and file5.json to the sink.
I only want to see the filename "file1.json" and "file5.json" in list.txt.
title.csv:
the final result in list.txt (sink):
file1.json
file5.json
ADF dataflow setup:
source option>column to store file name>list
derivedColumn1>Columns>Title and expression lower(trim(record.metadata.Title))
lookup1>primary stream> derivedColumn1
lookup1>lookup stream> sourceCSV
filter1>filter on> isNull(sourceCSV@title) == false()
sink>column data> list
I only want to see the filename "file1.json" and "file5.json" in list.txt.
After getting the required rows, you can use select transformation to get only the required filename
column.
Before the lookup, ensure the rows in your source csv file are changed to lower case using a derived column transformation.
Now, after using filter transformation, use select transformation with Rule-based mapping as shown below.
name=='filename'
Now, add your text file as sink in the dataflow and run the dataflow from the pipeline.
It will generate the target text file as expected.