azure-functions · azure-logic-apps · oracle-adf · azure-eventgrid · eventhub

Processing Event Hub XML file data using ADF to store in a SQL table


We have two systems. One system sends the data as XML files and stores them in Azure Event Hub, and I need to process the XML file data and store it in a SQL Server table.

Multiple files are sent to the Event Hub every second.

So far, we have tried:

  1. An Azure Function with a stored procedure, but it has performance issues for failed cases.
  2. Event Hub with Event Grid, a Logic App, and ADF, but we could not get it to work.

Could anyone suggest the best approach and some links?


Solution

  • You can use ADF in this scenario. First, send your Event Hub data to either ADLS Gen2 or Blob Storage; then use ADF to meet your requirement.

    To send the Event Hub data to a storage account, you need to enable Capture on the Event Hub.


    Set the time window and size window as per your requirement and specify a container for the data. As a sample, I have chosen the Avro file format; you can choose Parquet as well, depending on your requirement.
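
    If you prefer to set up Capture from code rather than in the portal, here is a minimal sketch using the azure-mgmt-eventhub Python SDK. All resource names, the subscription ID, the storage account ID, and the interval/size values are placeholders; adjust them to your requirement.

    ```python
    # Sketch: enable Event Hubs Capture with the azure-mgmt-eventhub SDK.
    # All names, IDs, and interval/size values below are placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.eventhub import EventHubManagementClient
    from azure.mgmt.eventhub.models import (
        CaptureDescription,
        Destination,
        EncodingCaptureDescription,
        Eventhub,
    )

    client = EventHubManagementClient(DefaultAzureCredential(), "<subscription-id>")

    client.event_hubs.create_or_update(
        resource_group_name="my-rg",
        namespace_name="my-eh-namespace",
        event_hub_name="my-eventhub",
        parameters=Eventhub(
            capture_description=CaptureDescription(
                enabled=True,
                encoding=EncodingCaptureDescription.AVRO,
                interval_in_seconds=300,        # time window: write a file every 5 minutes
                size_limit_in_bytes=314572800,  # size window: or once 300 MB accumulates
                destination=Destination(
                    name="EventHubArchive.AzureBlockBlob",
                    storage_account_resource_id=(
                        "/subscriptions/<sub-id>/resourceGroups/my-rg"
                        "/providers/Microsoft.Storage/storageAccounts/mystorageaccount"
                    ),
                    blob_container="eventhub-capture",
                    archive_name_format=(
                        "{Namespace}/{EventHub}/{PartitionId}"
                        "/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}"
                    ),
                ),
            ),
        ),
    )
    ```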

    After this, you need to use a Mapping Data Flow in your ADF pipeline. In the source, create an ADLS Gen2 Avro dataset, and in the sink, create a SQL database dataset.
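
    Note that each file Capture writes is Avro with a fixed envelope schema, and your XML payload lands in its Body field as bytes. The sketch below (using fastavro, with a hypothetical file path and XML element names) shows the shape of the data your Data Flow source will read:

    ```python
    # Sketch: inspect one captured Avro file and pull the XML payload out of the
    # "Body" field of the Capture envelope. The file path and the XML element
    # names (OrderId, Amount) are hypothetical.
    import xml.etree.ElementTree as ET

    from fastavro import reader

    with open("eventhub-capture/12.avro", "rb") as f:
        for record in reader(f):
            # Capture envelope fields: SequenceNumber, Offset, EnqueuedTimeUtc,
            # SystemProperties, Properties, Body
            root = ET.fromstring(record["Body"])
            order_id = root.findtext("OrderId")
            amount = root.findtext("Amount")
            print(record["EnqueuedTimeUtc"], order_id, amount)
    ```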

    You can schedule this data flow via a pipeline with a schedule trigger so that it copies all the latest data to your target.
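
    If you would rather define the trigger in code than in ADF Studio, a rough sketch with the azure-mgmt-datafactory SDK could look like this (factory, pipeline, and trigger names are placeholders):

    ```python
    # Sketch: attach a schedule trigger to the pipeline with azure-mgmt-datafactory.
    # Factory, pipeline, and trigger names are placeholders.
    from datetime import datetime, timezone

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        PipelineReference,
        ScheduleTrigger,
        ScheduleTriggerRecurrence,
        TriggerPipelineReference,
        TriggerResource,
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    trigger = TriggerResource(
        properties=ScheduleTrigger(
            recurrence=ScheduleTriggerRecurrence(
                frequency="Minute",
                interval=15,  # run the pipeline every 15 minutes
                start_time=datetime.now(timezone.utc),
                time_zone="UTC",
            ),
            pipelines=[
                TriggerPipelineReference(
                    pipeline_reference=PipelineReference(
                        reference_name="CopyEventHubCaptureToSql"
                    )
                )
            ],
        )
    )

    adf.triggers.create_or_update("my-rg", "my-adf", "EveryFifteenMinutes", trigger)
    # Triggers are created in the 'Stopped' state, so start it explicitly.
    adf.triggers.begin_start("my-rg", "my-adf", "EveryFifteenMinutes").result()
    ```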

    For incremental loading, and to avoid reprocessing files that were already loaded, you can use the Move option in the Data Flow source to move the source data to a temporary storage location on every pipeline run.


    Or, if you want to delete the files after loading, you can choose the Delete source files option.

    In the sink, use the Upsert option to avoid duplicate data.
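
    Upsert needs key columns so the sink can decide between update and insert. Conceptually it behaves like the T-SQL MERGE below, shown here through pyodbc with hypothetical table and column names:

    ```python
    # Sketch of what the sink's Upsert effectively does: a T-SQL MERGE keyed on a
    # business key. Table name (dbo.Orders), key (OrderId), and the connection
    # string are hypothetical.
    import pyodbc

    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=myserver.database.windows.net;Database=mydb;"
        "Uid=myuser;Pwd=<password>;Encrypt=yes;"
    )

    merge_sql = """
    MERGE dbo.Orders AS target
    USING (SELECT ? AS OrderId, ? AS Amount) AS source
    ON target.OrderId = source.OrderId
    WHEN MATCHED THEN
        UPDATE SET Amount = source.Amount
    WHEN NOT MATCHED THEN
        INSERT (OrderId, Amount) VALUES (source.OrderId, source.Amount);
    """

    cursor = conn.cursor()
    cursor.execute(merge_sql, ("ORD-1001", 42.50))
    conn.commit()
    ```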

    On every pipeline run, the input data will be copied to your target, and the loaded files in the storage account will be deleted or moved to the temporary location.

    You can go through this blog by @Inder Rana to learn more about this approach.