I have a folder with Python files and Databricks notebooks. Currently I use a CI/CD pipeline that clones those files to a Databricks workspace via REST calls. I was considering replacing that with Databricks Asset Bundles to get rid of my native implementation, since the docs state:
Bundles make it possible to describe Databricks resources such as jobs, pipelines, and notebooks as source files
Assume I have a folder in git that looks like the following:
resources
- *.py
databricks.yml
I simply want an asset bundle that takes everything in the resources folder and copies it to a folder in my Databricks workspace.
I tried the following setup in databricks.yml:
include:
  - resources/*

targets:
  dev:
    workspace:
      host: https://adb***.net
      root_path: ~/DATABRICKS_BUNDLES
      notebooks_path: /Workspace/test
However, I always get an error:
Error: failed to load /agent/_work/1/s/xxxx/notebook.py: :0:0: expected a map, found a invalid
When I remove the *.py entry from the include section, it runs, but nothing is copied.
According to the documentation, include is only for configuration files, i.e. only .yml files. The error you are seeing occurs because the CLI concatenates all the files specified in the include paths, and since you listed Python files there, it fails.
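If some of your files are excluded by .gitignore, you can force them to be uploaded with the sync mapping rather than include. A minimal sketch (the bundle name my_bundle is hypothetical; host and root_path are taken from your question):

```yaml
bundle:
  name: my_bundle  # hypothetical name

# sync controls which local files are uploaded on deploy;
# include re-adds paths that .gitignore would otherwise skip
sync:
  include:
    - resources/*

targets:
  dev:
    workspace:
      host: https://adb***.net
      root_path: ~/DATABRICKS_BUNDLES
```

With this, a deploy should upload the contents of resources under the bundle's files directory below root_path.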
Databricks Asset Bundles relies heavily on the .gitignore to sync folders and files between local and the workspace. If your files are not ignored by the .gitignore, you should be able to find them under your root_path.
By default it should be something like:
Workspace > Users > your_username > .bundle > bundle_name > target_name
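To check the configuration and push the files, the usual flow looks like the following (a sketch; -t selects the target defined in databricks.yml):

```shell
# check the bundle configuration without deploying anything
databricks bundle validate -t dev

# upload the synced files and create the declared resources
databricks bundle deploy -t dev
```

After the deploy, browse to the root_path in the workspace UI to confirm the files arrived.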