python, pyspark, databricks, azure-databricks, azure-storage-account

Can't access mounted volume with python on Databricks


I am trying to give a team access to an Azure Storage Account Gen2 container from their Databricks workspace by mounting it to the DBFS, using credential passthrough. I want to manage access with Azure Active Directory, since eventually some containers are to be mounted read-only.

I based my code on this tutorial: https://learn.microsoft.com/en-us/azure/databricks/data/data-sources/azure/adls-passthrough#adls-aad-credentials

Extract from my cluster conf:

"spark_conf": {
        "spark.databricks.cluster.profile": "serverless",
        "spark.databricks.passthrough.enabled": "true",
        "spark.databricks.delta.preview.enabled": "true",
        "spark.databricks.pyspark.enableProcessIsolation": "true",
        "spark.databricks.repl.allowedLanguages": "python,sql"
    }
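
One can read the flag back at runtime to check that the cluster actually applied this conf (a quick sketch; spark.conf.get just echoes whatever setting the cluster picked up):

# Sanity check: confirm the running cluster picked up the passthrough
# flag before attempting the mount.
spark.conf.get("spark.databricks.passthrough.enabled")   # expected: "true"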

I then run the following code:

dbutils.fs.mount(
  source = "wasbs://data@storage_account_name.blob.core.windows.net",
  mount_point = "/mnt/data/",
  extra_configs = {
    "fs.azure.account.auth.type": "CustomAccessToken",
    "fs.azure.account.custom.token.provider.class": spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName")
  }
)

The mount succeeds, and I can access the volume with dbutils:

>> dbutils.fs.ls('dbfs:/mnt/storage_account_name/data')
[FileInfo(path='dbfs:/mnt/storage_account_name/data/folder/', name='folder/', size=0)]
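
As an aside, dbutils.fs.mount raises an error if the mount point already exists, so a small guard keeps the notebook re-runnable (a sketch, reusing the placeholder names above):

# Idempotent mount: skip mounting when the mount point is already
# registered, so re-running the notebook does not fail on an existing mount.
mount_point = "/mnt/data/"
if not any(m.mountPoint.rstrip("/") == mount_point.rstrip("/")
           for m in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source = "wasbs://data@storage_account_name.blob.core.windows.net",
        mount_point = mount_point,
        extra_configs = {
            "fs.azure.account.auth.type": "CustomAccessToken",
            "fs.azure.account.custom.token.provider.class":
                spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName")
        }
    )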

My issue arises when I run %sh ls /dbfs/mnt/storage_account_name/data or try to access the mount with Python:

>> import os 
>> os.listdir('/dbfs/')
Out[1]: []

>> os.listdir('/dbfs/mnt/')
FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/mnt/'

I can't figure out what I am missing. Is there something to configure to make the mount accessible to Python? Thanks.


Solution

  • The answer is simple: this is a documented limitation.

    Local file API limitations

    The following list enumerates the limitations in local file API usage that apply to each Databricks Runtime version:

    All Databricks Runtime versions: does not support credential passthrough.


    Source: https://learn.microsoft.com/en-us/azure/databricks/data/databricks-file-system#local-file-apis
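
    Concretely, the mount is only visible through the driver-side DBFS APIs (dbutils.fs and the Spark readers); the local /dbfs FUSE paths that os and %sh rely on do not see passthrough mounts on any runtime. A sketch of the workaround, reusing the question's placeholder paths (the Parquet format is an assumption, swap in whatever the data actually is):

    # Use dbutils.fs instead of os.listdir / %sh ls on a passthrough cluster.
    for f in dbutils.fs.ls("dbfs:/mnt/storage_account_name/data"):
        print(f.path, f.size)

    # Spark DataFrame reads also go through the passthrough token provider.
    # Assuming Parquet files here; adjust the format to your data.
    df = spark.read.format("parquet").load("dbfs:/mnt/storage_account_name/data/folder/")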