azure, azure-data-factory

Azure Data Factory (ADF): create a CSV dataset with a dynamic file path


I want to create a CSV dataset that picks up a file whose name follows a pattern like file_name_ddmmyyyy, for example file_name_01012023.csv or file_name_01012024

It'd be enough to check whether the file name starts with file_name_

I want to extract the metadata of this file and proceed with further validations. Later on, I'd like to create a trigger that runs the pipeline whenever a file with this name pattern arrives in the blob storage container.

I can imagine the follow-up question: what is ADF supposed to do when more than one file matches this pattern? For now, I'll make sure that only one file matching the pattern is present.


Solution

  • You can use a storage event trigger in ADF for this requirement.

    First, create two string parameters in the pipeline's Parameters section and leave their default values empty.

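    Next, go to Triggers -> New and create a storage event trigger for the Blob created event. For the file_name*.csv pattern, that means setting "Blob path begins with" to file_name_ and "Blob path ends with" to .csv.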

    You can select all containers if you want.

    Then pass the trigger values @triggerBody().folderPath and @triggerBody().fileName to the pipeline parameters you created.

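    This trigger starts the pipeline whenever a file matching file_name*.csv is uploaded or modified.

    To make sure the triggering file matches the expected pattern, use an If Condition activity in the pipeline with the expression below. It takes the part of the file name after the second underscore, drops the extension, and checks that the result is 8 characters long and contains 20.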

    @and(equals(length(split(split(pipeline().parameters.filename,'_')[2],'.')[0]),8),contains(split(split(pipeline().parameters.filename,'_')[2],'.')[0],'20'))
    

    Inside the True activities of the If Condition, add the activities that use the file. To set the file path dynamically in the dataset, use dataset parameters: in the dataset, create a string parameter in the Parameters section and leave its value empty.

    In the dynamic content for the file name, reference that parameter as @dataset().<parameter_name>.

    When you use the dataset in an activity, you will be asked to provide a value for this parameter. Give it the expression below.

    @concat(pipeline().parameters.folderpath,'/',pipeline().parameters.filename)
    

    In this sample I used a Lookup activity; you can use a Copy activity or any other activity, depending on your requirement.
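
If you want to try the file name check outside ADF before wiring it into the If Condition, the logic of the two expressions in this answer can be sketched in Python. This is only a rough local mirror, not ADF code: the function names are made up for illustration, and `is_valid_ddmmyyyy` is a stricter variant that is not part of the answer, included because the length-and-contains-20 check also accepts names such as file_name_abcdef20.csv.

```python
from datetime import datetime


def _stamp(filename: str):
    """Return the third '_'-separated part without its extension, or None.

    Mirrors split(split(filename, '_')[2], '.')[0] from the If expression.
    The ADF expression would fail on a name with fewer than three parts;
    this sketch returns None instead.
    """
    parts = filename.split("_")
    if len(parts) < 3:
        return None
    return parts[2].split(".")[0]


def matches_adf_check(filename: str) -> bool:
    """Same rule as the If expression: 8 characters long and contains '20'."""
    stamp = _stamp(filename)
    return stamp is not None and len(stamp) == 8 and "20" in stamp


def is_valid_ddmmyyyy(filename: str) -> bool:
    """Hypothetical stricter check: the stamp must be a real ddmmyyyy date."""
    stamp = _stamp(filename)
    if stamp is None or len(stamp) != 8 or not stamp.isdigit():
        return False
    try:
        datetime.strptime(stamp, "%d%m%Y")
    except ValueError:
        return False
    return True


def dataset_path(folder_path: str, filename: str) -> str:
    """Same as @concat(folderpath, '/', filename) for the dataset parameter."""
    return folder_path + "/" + filename
```

With the file names from the question, `matches_adf_check` accepts both file_name_01012023.csv and file_name_01012024. If the looser rule turns out to let unwanted files through, the stricter date check would have to be rebuilt with ADF's own expression functions; the Python version only shows what such a rule would need to reject.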