
Databricks DBX and Asset Bundles: Support for Storing config files in Container/Storage Account


I'm trying to deploy a Databricks workflow that is configured using YAML files. Currently I'm using dbx. Instead of using the YAML files stored locally in my project, which are uploaded by the dbx deploy command, is there any way to point the workflow to a project-specific conf folder, mirroring the expected layout, stored somewhere else (an S3 bucket, an Azure container, etc.)?

Using dbx version 0.18

I tried manually finding the path for the YAML file uploads, as well as digging through all of the documentation.


Solution

  • I tried using the dbx CLI to deploy a deployment.yml file stored in Azure Blob Storage. To quickly test it, I generated a SAS token for just the deployment file, but dbx was unable to find the remote file.

    The workaround, therefore, was to use azcopy to download the remote deployment file from the cloud to local storage, and then run the dbx deploy command:

    azcopy copy "https://<account>.blob.core.windows.net/<container>/<path/to/blob>?<SAS>" "<path/to>/deployment.yml"
    dbx deploy --deployment-file <path/to>/deployment.yml
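
    Since the original question is about mirroring an entire project-specific conf folder rather than a single file, the same approach extends to directories, because azcopy supports recursive copies. As a sketch, assuming a SAS token with read and list permissions on the conf prefix, and assuming the deployment file follows dbx's default conf/deployment.yml layout:

    # Sketch: pull the whole conf/ layout down from Blob Storage, then deploy.
    # Assumes the SAS token grants read+list on the <container>/conf prefix.
    azcopy copy "https://<account>.blob.core.windows.net/<container>/conf?<SAS>" "." --recursive
    dbx deploy --deployment-file conf/deployment.yml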
    

    Note: First, you need to download the AzCopy executable if you haven't already.
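
    As a sketch, on Linux you could fetch the binary with Microsoft's download shortlink (the aka.ms URL below is taken from the AzCopy docs; verify it for your platform):

    # Sketch: download and extract the AzCopy binary on Linux (x86_64).
    wget https://aka.ms/downloadazcopy-v10-linux -O azcopy.tar.gz
    tar -xzf azcopy.tar.gz --strip-components=1    # leaves the azcopy binary in the current directory
    ./azcopy --version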

    If you would like to deploy a file that resides in a git repository, you can simply clone the repo and deploy the file:

    git clone https://github.com/<username>/<repository>
    dbx deploy --deployment-file <repository>/<path/to>/deployment.yml
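
    If the repository is large or you only need one branch, a shallow clone keeps this step fast (a sketch using standard git options; <branch> is an assumption for whichever branch holds your deployment file):

    git clone --depth 1 --branch <branch> https://github.com/<username>/<repository>
    dbx deploy --deployment-file <repository>/<path/to>/deployment.yml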
    

    If it is a raw file URL, you can download it with curl or wget:

    curl --output deployment.yml https://your-domain-name/some_folder/your_deployment.yml
    dbx deploy --deployment-file deployment.yml
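
    The wget equivalent (same URL, different download tool):

    wget -O deployment.yml https://your-domain-name/some_folder/your_deployment.yml
    dbx deploy --deployment-file deployment.yml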