
How do I access Databricks Repos metadata?


Is there a way to access data such as the repo URL and branch name from inside a notebook within a Repo? Perhaps something in dbutils.


Solution

  • You can use the Repos API for this, specifically its Get command. Extract the notebook path from the notebook context available via dbutils, then make two requests:

    1. Get the repo ID by path via the Workspace API. A repo path always consists of three components: /Repos, a directory (per user or custom), and the repository name.
    2. Fetch the repo data from the Repos API using that ID.

    Something like this:

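    import json
    import requests

    # The notebook context carries the notebook path and the workspace API URL.
    ctx = json.loads(
      dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson())

    notebook_path = ctx['extraContext']['notebook_path']
    # Keep only /Repos/<directory>/<repo name>; the leading '/' yields an empty first element.
    repo_path = '/'.join(notebook_path.split('/')[:4])
    api_url = ctx['extraContext']['api_url']
    api_token = "your_PAT_token"  # replace with your personal access token
    headers = {"Authorization": f"Bearer {api_token}"}

    # 1. Resolve the repo directory to its object ID via the Workspace API.
    repo_dir_data = requests.get(f"{api_url}/api/2.0/workspace/get-status",
                                 headers=headers,
                                 params={"path": repo_path}).json()
    repo_id = repo_dir_data['object_id']

    # 2. Fetch the repo metadata via the Repos API.
    repo_data = requests.get(f"{api_url}/api/2.0/repos/{repo_id}",
                             headers=headers).json()

    # The response includes the Git remote URL and the checked-out branch.
    repo_url = repo_data['url']
    branch = repo_data['branch']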
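
The path slicing above assumes the notebook actually lives under /Repos. If the same code might also run from a regular workspace folder, a small guard helps. The following is only a sketch: the helper names `repo_root` and `summarize_repo` and all sample values are made up for illustration, and it assumes the Repos API response carries `url` and `branch` keys.

```python
from typing import Optional


def repo_root(notebook_path: str) -> Optional[str]:
    """Return '/Repos/<directory>/<repo name>' for a notebook inside a repo, else None."""
    parts = notebook_path.split("/")
    # A repo notebook path splits into ['', 'Repos', '<directory>', '<repo name>', ...].
    if len(parts) < 5 or parts[0] != "" or parts[1] != "Repos":
        return None
    return "/".join(parts[:4])


def summarize_repo(repo_data: dict) -> dict:
    """Pick the Git URL and branch out of a repo response, tolerating missing keys."""
    return {"url": repo_data.get("url"), "branch": repo_data.get("branch")}


# Hypothetical sample values, for illustration only.
sample_path = "/Repos/someone@example.com/my-repo/etl/load"
sample_response = {
    "id": 123,
    "path": "/Repos/someone@example.com/my-repo",
    "url": "https://github.com/example/my-repo.git",
    "branch": "main",
}

print(repo_root(sample_path))
print(summarize_repo(sample_response))
```

In a notebook you would pass the `notebook_path` taken from the context to `repo_root`, and skip the API calls when it returns `None`.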