The actual problem I'm trying to solve is that I'm using mkdocs/mkdocs-material for my documentation, but that tool can't work with notebook-type files.
So the clumsy workaround I'm figuring is to add an intermediate step that creates a copy of each notebook's content as a .py file in the same workspace folder, have mkdocs build off of those copies, then delete the copies before pushing (a sketch of that loop is at the end of this post).
For example, I've got a notebook-type object in my workspace. Its display looks like this:
%sql
select * from something
%sql
select * from something_else
def some_dummy_function():
    print('dummy')
When you export a notebook as a source Python file via the GUI, you get this, with all the tagging for syntax:
# Databricks notebook source
# MAGIC %sql
# MAGIC select * from something
# COMMAND ----------
# MAGIC %sql
# MAGIC select * from something_else
# COMMAND ----------
def some_dummy_function():
    print('dummy')
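(Side note: the MAGIC and COMMAND lines are ordinary Python comments, so the exported file is already valid Python. If those tags end up cluttering the docs build, stripping them only takes a few lines. A minimal sketch, assuming the SOURCE format shown above; strip_databricks_markers is just a name I made up:)

import re

def strip_databricks_markers(source: str) -> str:
    # Drop the export header and cell separators, and demote
    # "# MAGIC ..." lines to plain comments.
    kept = []
    for line in source.splitlines():
        if line.startswith("# Databricks notebook source") or line.startswith("# COMMAND ----------"):
            continue
        kept.append(re.sub(r"^# MAGIC ?", "# ", line))
    return "\n".join(kept)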
I want to get this programmatically, from a notebook in a workspace.
Or if you've got suggestions for the root problem at hand ... all ears.
I'm cobbling this together using the following as a reference, mainly for the base64 decode idea:
String search in all Databricks notebook in workspace level
and this handy package: https://pypi.org/project/databricks-api/
pip install databricks-api
from databricks_api import DatabricksAPI
import base64

# Pull the workspace URL and an API token from the current notebook's
# context, so nothing needs to be hard-coded.
notebook_context = dbutils.notebook.entry_point.getDbutils().notebook().getContext()

databricks_api_instance = DatabricksAPI(
    host=notebook_context.apiUrl().getOrElse(None),
    token=notebook_context.apiToken().getOrElse(None),
)

# format="SOURCE" returns the same tagged .py content as the GUI export,
# base64-encoded in the response's 'content' field.
response = databricks_api_instance.workspace.export_workspace(
    "/Repos/me@my_company.com/my_repo/my_notebook",
    format="SOURCE",
    direct_download=None,
    headers=None,
)

notebook_content = base64.b64decode(response["content"]).decode("utf-8")

# Write the decoded source next to the repo content so mkdocs can pick it up.
with open("/Workspace/Repos/me@my_company.com/my_repo/new_file_name.py", "w") as f:
    f.write(notebook_content)
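To scale that up to the workaround itself (export every notebook under the repo folder as a .py copy, then delete the copies before pushing), a recursive walk should do it. A sketch, with the caveats that workspace.list is my assumption about the wrapper's method name for the list endpoint, and that export_folder and the docs_copies destination are made-up names:

import os

def export_folder(api, src_folder, dest_folder):
    # Walk the workspace tree, exporting each Python notebook
    # as a tagged .py source file into dest_folder.
    os.makedirs(dest_folder, exist_ok=True)
    for obj in api.workspace.list(src_folder).get("objects", []):
        if obj["object_type"] == "DIRECTORY":
            export_folder(api, obj["path"], dest_folder)
        elif obj["object_type"] == "NOTEBOOK" and obj.get("language") == "PYTHON":
            resp = api.workspace.export_workspace(obj["path"], format="SOURCE")
            content = base64.b64decode(resp["content"]).decode("utf-8")
            out_name = os.path.basename(obj["path"]) + ".py"
            with open(os.path.join(dest_folder, out_name), "w") as f:
                f.write(content)

export_folder(
    databricks_api_instance,
    "/Repos/me@my_company.com/my_repo",
    "/Workspace/Repos/me@my_company.com/my_repo/docs_copies",
)

Cleaning up before the push is then just a matter of deleting the destination folder.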