I am trying to push Databricks notebooks from Azure DevOps to a Databricks workspace using a DevOps pipeline. Below is the code I am using.
variables:
  databricksWorkspaceUrl: ''
  databricksPAT: 456t7788
trigger:
- development
pool:
  vmImage: 'ubuntu-latest'
steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: '3.x'
    addToPath: true
- checkout: self
- script: |
    echo "Starting Databricks notebook upload..."
    # Install Databricks CLI
    pip install databricks-cli
    # Authenticate with Databricks using the PAT
    echo "Authenticating with Databricks..."
    databricks configure --token
    echo "$(databricksPAT)" | databricks tokens create --comment "Databricks PAT" --output json
    # Upload notebooks to Databricks workspace
    echo "Uploading notebooks to Databricks..."
    databricks workspace import $(Build.SourcesDirectory)/notebookpathindevops
    databricks workspace import $(Build.SourcesDirectory)/notebookpathindevops
    echo "Notebooks uploaded successfully."
    # Trigger Databricks pipeline
    echo "Triggering Databricks pipeline..."
    databricks runs submit --json '{
      "run_name": "My Notebook Run",
      "new_cluster": {
        "spark_version": "7.3.x"
      },
      "notebook_task": {
        "notebook_path": "/Folderpathinworkspace"
      }
    }' --url $(databricksWorkspaceUrl)
    echo "Databricks pipeline triggered."
  displayName: 'Upload Notebooks to Databricks and Trigger Databricks Pipeline'
  env:
    databricksPAT: $(databricksPAT) # Use the variable defined in Azure DevOps
The pipeline gets stuck at the "Authenticate with Databricks using the PAT" step. Is there anything more I need to add to my code?
Thank you
Normally, when you run the databricks configure --token command to create a configuration profile, it prompts you for the workspace host URL and then for your Azure Databricks PAT. See "Azure Databricks personal access token authentication".
A pipeline run is non-interactive, so nobody can answer those prompts: the session hangs waiting for input until the job times out.
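To resolve this issue, you can supply the answers on standard input with a here-document, one line per prompt (host first, then token). Make sure databricksWorkspaceUrl is set to your workspace URL; it is empty in your YAML. Try the lines below in your script.
databricks configure --token <<EOF
$(databricksWorkspaceUrl)
$databricksPAT
EOF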
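Another way to avoid the prompt entirely is to write the profile file yourself. This is only a minimal sketch, assuming the pip-installed databricks-cli reads its profile from ~/.databrickscfg (the file that databricks configure creates). The function name write_databricks_cfg and its three arguments are hypothetical names chosen for illustration, not part of the CLI.

```shell
#!/usr/bin/env bash
# Sketch: create a Databricks CLI profile without any interactive prompt.
# write_databricks_cfg is a hypothetical helper, not a Databricks command.
# Arguments: host URL, token, optional config path (defaults to ~/.databrickscfg).
write_databricks_cfg() {
  local host="$1"
  local token="$2"
  local cfg="${3:-$HOME/.databrickscfg}"

  # Refuse to write an unusable profile if either value is empty.
  if [ -z "$host" ] || [ -z "$token" ]; then
    echo "host and token must both be non-empty" >&2
    return 1
  fi

  # Use a restrictive umask in a subshell, since the file will hold a secret.
  (
    umask 077
    printf '[DEFAULT]\nhost = %s\ntoken = %s\n' "$host" "$token" > "$cfg"
  )
}
```

In the pipeline script you could call it as write_databricks_cfg "$(databricksWorkspaceUrl)" "$databricksPAT" before the workspace import commands. The empty-value check is there because an empty workspace URL, as in the YAML in the question, would otherwise produce a profile the CLI cannot use. The CLI can also pick up DATABRICKS_HOST and DATABRICKS_TOKEN from the environment, which avoids the file altogether.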