git azure-devops databricks azure-databricks databricks-repos

Update a Databricks workspace notebook after a pull request in Git


I have Git integrated with Azure Databricks, and I want the notebook in my workspace to be updated every time I complete a pull request into Dev, for example, so that any change in Git also appears in the workspace notebook. I believe this can be done by kicking off a release pipeline in Azure DevOps when the pull request happens, and there may also be an API that helps with this, but I can't find any information.

Any information or links to articles for performing this would be very helpful, thanks.


Solution

  • This can be automated only if your notebooks are in Databricks Repos; it's not possible to update a notebook in the workspace via the API. Once the repository is checked out, you can update it to the latest state of a selected branch using the Update command of the Repos REST API or the repos update command of the Databricks CLI. For pull requests, you need to work out how to obtain the branch name, because that depends heavily on the specific Git implementation; you can find that information in the Azure DevOps documentation.

    I have a working example of integrating Databricks Repos with Azure DevOps to test the code in notebooks and promote it between stages using the Databricks CLI. The repository contains detailed instructions on how to set everything up; it's too much to include here.
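
As a rough illustration of the Repos REST API update call mentioned above, here is a minimal Python sketch that a pipeline step could run after a pull request is merged. It assumes the update endpoint is `PATCH /api/2.0/repos/{repo_id}` with a JSON body containing `branch`, and that you authenticate with a personal access token. The function names, the workspace URL, the repo ID, and the branch name are all hypothetical placeholders, not taken from the example repository.

```python
import json
import urllib.request


def build_repo_update_request(host, token, repo_id, branch):
    """Build (but do not send) the PATCH request that asks Databricks
    to move a checked-out repo to the latest commit of `branch`.

    Assumption: the endpoint is PATCH {host}/api/2.0/repos/{repo_id}
    and the request body is {"branch": "<name>"}.
    """
    url = f"{host.rstrip('/')}/api/2.0/repos/{repo_id}"
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def update_repo(host, token, repo_id, branch):
    """Send the update request and return the parsed JSON response."""
    request = build_repo_update_request(host, token, repo_id, branch)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage from a pipeline step (all values are placeholders):
# update_repo("https://<workspace-url>", "<access-token>", 123, "dev")
```

In an Azure DevOps pipeline, the host and token would normally come from secret pipeline variables, and the branch name from whichever predefined variable matches your trigger, as noted above.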