Using Databricks Repos, you can add a git repo to Databricks and execute git actions such as git pull. This is done by clicking on the branch name in the top left, and clicking the button saying "Pull".
I would like to do this without clicking on things in my browser, for example from the command line or through the REST API.
I would assume that both are possible (this answer implies so), but providing just one would be sufficient to answer my question.
One might wonder what I expect to happen if a pull is non-trivial, e.g. the branches have diverged or "your unstaged changes would be wiped out by pulling...". Simply erroring out would be sufficient in this case. I intend to ensure through other mechanisms that this never happens.
For databricks-cli, it's the databricks repos update command:
>databricks repos update -h
Usage: databricks repos update [OPTIONS]

  Checks out the repo to the given branch or tag. This call returns an error
  if the branch or tag doesn't exist.

Options:
  --repo-id TEXT  Repo ID
  --path TEXT     Workspace path of the repo object
  --branch TEXT   Branch name
  --tag TEXT      Tag name
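It checks out the branch even if the repo is already on the given branch, which updates the repo to the latest commit on that branch (the equivalent of a pull):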
databricks repos update --path /Repos/.... --branch releases
You can find a working demo of it in the following repository, which shows the integration of Repos with Azure DevOps.
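If the CLI call is driven from a script, a thin wrapper can turn a failed update into an exception, which matches the "simply erroring out" behaviour asked for in the question. The following is only a sketch: the function names `build_update_command` and `pull_repo` are hypothetical, and it assumes the `databricks` executable is on the PATH, is already configured with credentials, and exits with a non-zero status when the update fails.

```python
import subprocess


def build_update_command(repo_path, branch):
    """Build the argv list for `databricks repos update`.

    Only the --path and --branch flags from the help text above are used.
    """
    return ["databricks", "repos", "update", "--path", repo_path, "--branch", branch]


def pull_repo(repo_path, branch, runner=subprocess.run):
    """Run the update and raise if the CLI reports a failure.

    Assumption: the CLI signals failure through a non-zero exit status.
    `runner` is injectable so the function can be exercised without the CLI.
    """
    result = runner(
        build_update_command(repo_path, branch),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"repos update failed: {result.stderr.strip()}")
    return result.stdout
```

A call would look like `pull_repo(repo_path, "releases")`, where `repo_path` is the workspace path of your repo under `/Repos/`.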
For the REST API, there is a corresponding endpoint. The only difference from the CLI is that it accepts only the repo ID, not the path, but you can look up the repo ID from the path via the Get Status endpoint of the Workspace API. You can find an example in the history of the same demo repository (please note that the Repos API may have changed since then).
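The two REST calls described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the demo repository: it assumes the `GET /api/2.0/workspace/get-status` and `PATCH /api/2.0/repos/{repo_id}` endpoints, bearer-token authentication with a personal access token, and that the `object_id` returned by Get Status is the repo ID. The helper names `http_json` and `pull_repo_via_api` are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def http_json(method, url, token, body=None):
    """Send one authenticated request and decode the JSON response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
    # so a failed update surfaces as an exception.
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode() or "{}")


def pull_repo_via_api(host, token, repo_path, branch, send=http_json):
    """Resolve the repo ID from its workspace path, then update the repo."""
    # Step 1: Workspace API, Get Status: workspace path -> object ID.
    query = urllib.parse.urlencode({"path": repo_path})
    status = send("GET", f"{host}/api/2.0/workspace/get-status?{query}", token)
    repo_id = status["object_id"]  # assumption: this is the repo ID
    # Step 2: Repos API: check out the branch, which updates the repo
    # to the latest commit on it.
    return send("PATCH", f"{host}/api/2.0/repos/{repo_id}", token, {"branch": branch})
```

Here `host` is the workspace URL without a trailing slash, and `send` is injectable so the request sequence can be checked without network access.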