Our Databricks job used to connect to ADLS Gen2 storage and process the files successfully. Recently, after renewing the Service Principal secret and updating it in Key Vault, the jobs started failing.
Using the Databricks CLI (databricks secrets list-scopes --profile mycluster), I was able to identify which Key Vault is being used, and I verified that the corresponding secrets were updated correctly.
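For reference, the same check can also be done from a notebook (a minimal sketch; the scope name below is a placeholder, and secret values themselves are always redacted):

# List the secret scopes visible to the workspace and the keys inside the Key Vault-backed scope
for scope in dbutils.secrets.listScopes():
    print(scope.name)
for secret in dbutils.secrets.list("name-of-the-scope-used-in-databricks-workspace"):
    print(secret.key)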
Within the notebook, I followed the link and was able to access the ADLS. Below is the code I used to test the Key Vault values and access the ADLS.
# Read the service principal credentials from the Key Vault-backed secret scope
scopename = "name-of-the-scope-used-in-databricks-workspace"
appId = dbutils.secrets.get(scope=scopename, key="name-of-the-key-from-keyvault-referring-appid")
directoryId = dbutils.secrets.get(scope=scopename, key="name-of-key-from-keyvault-referring-TenantId")
secretValue = dbutils.secrets.get(scope=scopename, key="name-of-key-from-keyvault-referring-Secretkey")

# Configure OAuth (client credentials) access to the storage account for this session
storageAccount = "ADLS-Gen2-StorageAccountName"
spark.conf.set(f"fs.azure.account.auth.type.{storageAccount}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storageAccount}.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storageAccount}.dfs.core.windows.net", appId)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storageAccount}.dfs.core.windows.net", secretValue)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storageAccount}.dfs.core.windows.net", f"https://login.microsoftonline.com/{directoryId}/oauth2/token")

# List the contents to confirm the credentials work
dbutils.fs.ls("abfss://<container-name>@<storage-accnt-name>.dfs.core.windows.net/<folder>")
With an attached cluster, the code above successfully lists the folders/files within the ADLS Gen2 storage.
Below is the code that originally created the mount point, which used the old secret values.
scope_name = "name-of-the-scope-from-workspace"
directoryId = dbutils.secrets.get(scope=scope_name, key="name-of-key-from-keyvault-which-stores-tenantid-value")

# OAuth (client credentials) configuration built from the Key Vault-backed secrets
configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": dbutils.secrets.get(scope=scope_name, key="name-of-key-from-key-vault-referring-to-clientid"),
           "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope=scope_name, key="name-of-key-from-key-vault-referring-to-secretvalue-generated-in-sp-secrets"),
           "fs.azure.account.oauth2.client.endpoint": f"https://login.microsoftonline.com/{directoryId}/oauth2/token"}

storage_acct_name = "storageaccountname"
container_name = "name-of-container"
mount_point = "/mnt/appadls/content"

# Mount the container only if it is not already mounted
if not any(mount.mountPoint == mount_point for mount in dbutils.fs.mounts()):
    print(f"Mounting {mount_point} to DBFS filesystem")
    dbutils.fs.mount(
        source=f"abfss://{container_name}@{storage_acct_name}.dfs.core.windows.net/",
        mount_point=mount_point,
        extra_configs=configs)
else:
    print(f"Mount point {mount_point} has already been mounted.")
In my case the Key Vault stores the client ID, the tenant/directory ID, and the SP secret key.
After renewing the service principal, when accessing the /mnt/ path I see the exception below.
...
response '{"error":"invalid_client","error_description":"AADSTS7000215: Invalid client secret is provided.
The only thing I can think of is that the mount point was created with the old secrets, as in the code above. After renewing the service principal, do I need to unmount and re-create the mount point?
So I finally tried to unmount and re-mount the ADLS Gen2 storage, and now I am able to access it.
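Roughly, this is what I ran (a minimal sketch reusing mount_point, container_name, storage_acct_name and configs from the mount code above, rebuilt after the Key Vault was updated so they hold the renewed secret):

# Drop the existing mount, which still carries the old client secret
if any(mount.mountPoint == mount_point for mount in dbutils.fs.mounts()):
    dbutils.fs.unmount(mount_point)

# Re-create the mount with the configs built above, which now contain the renewed secret
dbutils.fs.mount(
    source=f"abfss://{container_name}@{storage_acct_name}.dfs.core.windows.net/",
    mount_point=mount_point,
    extra_configs=configs)

# Already-running clusters may also need their mount cache refreshed
dbutils.fs.refreshMounts()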
I didn't expect the configuration to be persisted with the mount; I assumed just updating the service principal secret in Key Vault would be sufficient.