I would like to use an event based trigger to run a data factory pipeline.
The trigger will check a folder in a data lake for any new file and start a pipeline once a new CSV file is copied.
The pipeline will then copy the data to an intermediate table to check its consistency (multiple checks using different data flow activities) and if everything's correct, copies it into a stage table.
It is thus very important that the intermediate table will contain the data from only one single CSV file before it is checked.
I have read though that the event based trigger will start in parallel as many pipelines as the (simultaneously) downloaded CSV files.
Is this right? in this case how can I force each Pipeline to wait until the previous one is done?
Thank you for your help.
There is a flag on the pipeline properties (accssible in the top-right of the editor pane) called concurrency
. Set this to 1 and only one copy will run and any other invocations will be queued until that one finishes.