databricks databricks-unity-catalog

ManagedIdentityCredential in Shared Compute - Databricks


We use notebooks to ingest data from another system and write it to an Azure Storage account.

Our team recently enabled System Managed Identity on Azure Databricks, and when I try to connect to the Azure Storage account, I get this error:

DefaultAzureCredential failed to retrieve a token from the included credentials.
Attempted credentials:
    EnvironmentCredential: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
    Visit https://aka.ms/azsdk/python/identity/environmentcredential/troubleshoot to troubleshoot this issue.
    ManagedIdentityCredential: ManagedIdentityCredential authentication unavailable, no response from the IMDS endpoint.
    SharedTokenCacheCredential: SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
    AzureCliCredential: Azure CLI not found on path
    AzurePowerShellCredential: PowerShell is not installed
    AzureDeveloperCliCredential: Azure Developer CLI could not be found. Please visit https://aka.ms/azure-dev for installation instructions and then,once installed, authenticate to your Azure account using 'azd auth login'.
To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/python/identity/defaultazurecredential/troubleshoot.

Is there a way to authenticate with an Azure Storage account from a cluster in shared access mode? I can't use access keys because our enterprise team advises against them.


Solution

  • Turns out this is a documented limitation.

    I traced the managed identity authorization through older (DBR 14) clusters and their public GitHub codebase. The authorization step is done by calling 169.254.169.254, which is the instance metadata service (IMDS) of the current Azure virtual machine. As stated here:

    You cannot connect to the instance metadata service [using Unity Catalog shared access mode]
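    To check whether this limitation applies to a given cluster, a notebook can probe the metadata endpoint directly. The following is a minimal sketch, not the code path the Azure SDK itself runs: the helper names `build_imds_token_request` and `imds_reachable` are made up for illustration, and the storage resource URI is an assumption about which token you need.

```python
import urllib.error
import urllib.parse
import urllib.request

# Link-local address of the Azure Instance Metadata Service (IMDS) token endpoint.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_token_request(resource: str = "https://storage.azure.com/") -> urllib.request.Request:
    """Build the kind of HTTP request a managed-identity token lookup sends to IMDS."""
    query = urllib.parse.urlencode({"api-version": "2018-02-01", "resource": resource})
    # IMDS only answers requests that carry the "Metadata: true" header.
    return urllib.request.Request(f"{IMDS_TOKEN_URL}?{query}", headers={"Metadata": "true"})


def imds_reachable(timeout_seconds: float = 2.0, opener=urllib.request.urlopen) -> bool:
    """Return True if IMDS answers at all, False if the connection fails or times out."""
    try:
        with opener(build_imds_token_request(), timeout=timeout_seconds):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still proves the endpoint is reachable.
        return True
    except OSError:
        # URLError, timeouts, and refused connections all derive from OSError.
        return False
```

    Calling `imds_reachable()` in a notebook cell should return `False` on a cluster where the endpoint is blocked, which matches the "no response from the IMDS endpoint" line in the error above.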

    Solution: use a single-user cluster (and consider creating a dedicated user to run such workflows).
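    As an illustration of that setup, here is a sketch of a cluster definition as it could be sent to the Databricks Clusters API. The `data_security_mode` and `single_user_name` fields are the API's own; the cluster name, user name, runtime version, and node type are placeholders I picked for the example, so replace them with values valid in your workspace.

```python
import json


def single_user_cluster_spec(cluster_name: str, user_name: str) -> dict:
    """Return a Clusters API payload for a cluster in single-user access mode."""
    return {
        "cluster_name": cluster_name,
        # Placeholder runtime and VM size; substitute values valid in your workspace.
        "spark_version": "14.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 1,
        # SINGLE_USER is the API name for single-user access mode.
        "data_security_mode": "SINGLE_USER",
        # The one principal allowed to run commands on this cluster.
        "single_user_name": user_name,
    }


# Hypothetical dedicated user created only to run the ingestion workflows.
spec = single_user_cluster_spec("ingest-to-storage", "ingest-runner@example.com")
print(json.dumps(spec, indent=2))
```

    Pointing `single_user_name` at a dedicated account rather than a personal one keeps scheduled jobs working when team members change.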

    Summary:

    Access mode            Managed Identity   Unity Catalog   Team attach
    No isolation shared
    Shared
    Single user