I know how to write from Databricks using a storage account access key:
spark.conf.set(
"fs.azure.account.key.MyStorageAccount.blob.core.windows.net",
"XxXxXxXxXxXxXxXxXxXxXxXxXxXxXx")
df = spark.createDataFrame([(1, "foo")],["id", "label"])
df.write.format("delta").save("wasbs://MyContainer@MyStorageAccount.blob.core.windows.net/HERE")
Now I want to do the same using a Service Principal.
I found this code to generate a valid SP token:
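from azure.identity import ClientSecretCredential

tenant_id = "TENANT_ID"  # placeholder for the Azure AD tenant ID
client_id = "AXAXAXAX"
secret_id = "YNTHBRGEZFGTYUI"
storage_account_url = "https://MyStorageAccount.blob.core.windows.net/"
token_credential = ClientSecretCredential(tenant_id, client_id, secret_id)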
How can I pass it to my Spark session in order to access the storage account? My SPN is already a Contributor on the storage account MyStorageAccount.
EDIT:
After further searching, I found the following tutorial, so I wrote the same code:
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", client_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", secret_id)
spark.conf.set("fs.azure.account.oauth2.client.endpoint", "https://login.microsoftonline.com/TENANT_ID/oauth2/token")
df = spark.createDataFrame([(1, "foo")],["id", "label"])
df.write.format("delta").save("wasbs://MyContainer@MyStorageAccount.blob.core.windows.net/HERE")
But I am getting the following error:
shaded.databricks.org.apache.hadoop.fs.azure.AzureException:
shaded.databricks.org.apache.hadoop.fs.azure.AzureException: Container MyContainerin account MyStorageAccount.blob.core.windows.net not found, and we can't create it using anoynomous credentials, and no credentials found for them in the configuration.
The tutorial you linked is for Azure Synapse.
Databricks allows you to mount Blob Storage using an account key or a SAS (see the docs).
You can access Data Lake storage using OAuth (see the docs). So if you are able to convert your storage account (i.e. enable hierarchical namespace), then you'll be able to use it.
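Once hierarchical namespace is enabled, the OAuth setup could look roughly like the sketch below. It assumes the account is reached through the `abfss://` scheme and the `dfs.core.windows.net` endpoint rather than `wasbs://`, and that the config keys are scoped per storage account. The helper names `adls_oauth_conf` and `apply_conf` are made up for illustration, and the tenant ID, client ID and secret are placeholders.

```python
def adls_oauth_conf(storage_account, client_id, client_secret, tenant_id):
    """Build the Spark settings for service principal (OAuth) access to ADLS Gen2.

    Hypothetical helper: it only assembles key/value pairs and does not touch Spark.
    """
    suffix = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


def apply_conf(spark, conf):
    """Copy every setting into the given Spark session (hypothetical helper)."""
    for key, value in conf.items():
        spark.conf.set(key, value)


# In a Databricks notebook, where `spark` already exists (placeholder values):
# conf = adls_oauth_conf("MyStorageAccount", client_id, secret_id, "TENANT_ID")
# apply_conf(spark, conf)
# df = spark.createDataFrame([(1, "foo")], ["id", "label"])
# df.write.format("delta").save(
#     "abfss://MyContainer@MyStorageAccount.dfs.core.windows.net/HERE")
```

The main differences from the code in the question are the `abfss://` URI with the `dfs` endpoint and the account-scoped keys. The service principal will probably also need a data-plane role such as Storage Blob Data Contributor on the account, since the plain Contributor role may not be enough to read and write data.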