databricks, informatica-cloud, aws-databricks, informatica-data-integration-hub

Can we execute a Databricks notebook from Informatica?


There is an Informatica workflow that consists of multiple source systems. We are trying to migrate one of the sources to Databricks and would like to execute the Databricks job from Informatica, similar to what we could do with Airflow; but since the existing workflows are in Informatica, we would like to keep them there except for the ones converted to Databricks. Accessing Delta tables is possible, but in the existing version most of the transformations are written as SQL Qualifiers, so we would like to convert them to Spark SQL in Databricks notebooks and then execute those notebooks via Informatica.


Solution

  • Azure is built on web service (REST API) calls, and that includes everything in Azure Databricks. Here is a link on how to execute a job:

    https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsRunNow

    Jobs are now called workflows, in which one or more notebooks can be executed with precedence conditions between the calls (notebooks).

    Check out this blog on how to call a web service using REST from Informatica:

    https://blogs.perficient.com/2017/01/17/web-services-communication-using-informatica/
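To make the REST call concrete, here is a minimal sketch of triggering a Databricks job with the Jobs API 2.1 `run-now` endpoint, which is what an Informatica web service call would ultimately issue. The workspace URL, job ID, and token values are placeholders you would substitute with your own; the payload shape follows the Jobs API documentation linked above.

```python
import json
import urllib.request

# Placeholder values -- replace with your workspace URL, job ID, and a
# personal access token (or service principal token) for your workspace.
DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
JOB_ID = 123


def build_run_now_payload(job_id, notebook_params=None):
    """Build the JSON body for POST /api/2.1/jobs/run-now.

    notebook_params is an optional dict of parameters passed to the
    notebook task(s) in the job.
    """
    payload = {"job_id": job_id}
    if notebook_params:
        payload["notebook_params"] = notebook_params
    return payload


def run_job(host, token, job_id, notebook_params=None):
    """Trigger a job run and return the run_id from the response."""
    req = urllib.request.Request(
        url=f"{host}/api/2.1/jobs/run-now",
        data=json.dumps(build_run_now_payload(job_id, notebook_params)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["run_id"]
```

From Informatica the same request would be configured as an HTTP/Web Services Consumer call: POST to the `run-now` URL with the bearer-token header and the JSON body shown in `build_run_now_payload`. The returned `run_id` can then be polled via the `runs/get` endpoint to gate downstream workflow steps.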