databricks azure-databricks azure-keyvault

Using DefaultAzureCredential in a Databricks notebook job


I'm having an issue where I need to be an admin to use dbutils.secrets.get. The notebook and the job are created by a service principal, so both the owner and the run_as are that service principal, which is not an admin in Databricks. When I run the notebook to read a secret from the secret scope linked to Azure Key Vault, I get an error saying I must be an admin to use it.

I cannot reassign the job to another user (me) who is an admin, because the SP would need to be an admin to change the run_as.

The SP in Azure that is linked to Databricks has an RBAC role that lets it read and write in this Key Vault.

So I was thinking of using the managed identity from inside the Databricks notebook, together with the Azure SDK for Python, to get the secret from the Key Vault.

from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential

key_vault_name = "my_keyvault_name"

key_vault = SecretClient(
    vault_url=f"https://{key_vault_name}.vault.azure.net",
    credential=DefaultAzureCredential(),
)

my_secret_value = key_vault.get_secret("my_secret").value

And I get the usual Azure authentication failure:

DefaultAzureCredential failed to retrieve a token from the included credentials.
Attempted credentials:
    EnvironmentCredential: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
Visit https://aka.ms/azsdk/python/identity/environmentcredential/troubleshoot to troubleshoot this issue.
    ManagedIdentityCredential: ManagedIdentityCredential authentication unavailable, no response from the IMDS endpoint.
    SharedTokenCacheCredential: SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
    AzureCliCredential: Azure CLI not found on path
    AzurePowerShellCredential: PowerShell is not installed
    AzureDeveloperCliCredential: Azure Developer CLI could not be found. Please visit https://aka.ms/azure-dev for installation instructions and then,once installed, authenticate to your Azure account using 'azd auth login'.
To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/python/identity/defaultazurecredential/troubleshoot.

Is there no way to connect to Azure from Databricks without putting the client ID and client secret in plain text in the notebook?
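For anyone hitting the same error, one way to narrow it down is to test a single credential on its own instead of the whole DefaultAzureCredential chain. The sketch below is a debugging aid only, not part of the original question. The helper names `probe_credential` and `probe_managed_identity` are hypothetical, and it assumes the `azure-identity` package is installed on the cluster.

```python
from typing import Any, Optional, Tuple

# Token scope for the Azure Key Vault data plane.
KEY_VAULT_SCOPE = "https://vault.azure.net/.default"


def probe_credential(credential: Any, scope: str = KEY_VAULT_SCOPE) -> Tuple[bool, str]:
    """Ask one credential for a token and report success or the error text.

    Hypothetical helper: works with any object exposing get_token(scope).
    """
    try:
        token = credential.get_token(scope)
    except Exception as exc:  # azure-identity raises authentication errors here
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"token acquired, expires_on={token.expires_on}"


def probe_managed_identity(client_id: Optional[str] = None) -> Tuple[bool, str]:
    """Probe only the managed identity path (requires azure-identity).

    client_id is the client ID of a user-assigned managed identity;
    leave it as None to try the system-assigned identity, if any.
    """
    from azure.identity import ManagedIdentityCredential

    return probe_credential(ManagedIdentityCredential(client_id=client_id))
```

Calling `probe_managed_identity()` in a notebook cell returns a `(succeeded, message)` pair, so the managed identity failure can be read without the noise from the other credentials in the chain.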


Solution

  • Delete the secret scope you created. When a Databricks workspace is deployed, a "managed" resource group is created along with a user-assigned managed identity, which is visible among the resources of that managed resource group.

    In the Azure Key Vault, go to Access control (IAM), click Add, select Add role assignment, and choose the Key Vault Administrator role (Key Vault Secrets User is enough if the notebook only needs to read secrets). Check "User, group, or service principal", search for the user-assigned managed identity, select it, and click the Review + assign button.

    After the role assignment succeeds, you will be able to read the secret from the Databricks workspace using the code below:

    from azure.keyvault.secrets import SecretClient
    from azure.identity import DefaultAzureCredential

    key_vault_name = "<keyVaultName>"
    key_vault = SecretClient(
        vault_url=f"https://{key_vault_name}.vault.azure.net",
        credential=DefaultAzureCredential(),
    )
    my_secret_value = key_vault.get_secret("<secretName>").value
    print(my_secret_value)
    

    The secret is then retrieved successfully.

    If you would rather use a secret scope, you can check this.
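If more than one managed identity is attached to the compute, it may help to tell DefaultAzureCredential which one to use. The sketch below is a variant of the code above under that assumption, not a confirmed requirement for Databricks. The helper names `build_vault_url` and `read_secret` are hypothetical, while `managed_identity_client_id` is an existing keyword argument of `DefaultAzureCredential`. The name check is only a rough guard: Key Vault names are 3 to 24 characters of letters, digits, and hyphens, so a name containing underscores cannot form a valid vault URL.

```python
import re
from typing import Optional

# Rough check: starts with a letter, ends with a letter or digit,
# 3-24 characters, only letters, digits, and hyphens.
_VAULT_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9-]{1,22}[A-Za-z0-9]$")


def build_vault_url(key_vault_name: str) -> str:
    """Return the vault URL, rejecting names Key Vault would not accept."""
    if not _VAULT_NAME.match(key_vault_name):
        raise ValueError(f"not a valid Key Vault name: {key_vault_name!r}")
    return f"https://{key_vault_name}.vault.azure.net"


def read_secret(
    key_vault_name: str,
    secret_name: str,
    managed_identity_client_id: Optional[str] = None,
) -> Optional[str]:
    """Read one secret, optionally pinning a user-assigned managed identity.

    Requires the azure-identity and azure-keyvault-secrets packages.
    """
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    credential_kwargs = {}
    if managed_identity_client_id:
        # Only pass the client ID when one was given, so the SDK default applies otherwise.
        credential_kwargs["managed_identity_client_id"] = managed_identity_client_id

    client = SecretClient(
        vault_url=build_vault_url(key_vault_name),
        credential=DefaultAzureCredential(**credential_kwargs),
    )
    return client.get_secret(secret_name).value
```

A call would look like `read_secret("<keyVaultName>", "<secretName>", managed_identity_client_id="<clientId>")`, where the client ID is shown on the managed identity's overview page in the Azure portal.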