
How to mount Azure Fabric OneLake with Databricks Notebook


Can someone let me know how to mount Azure Fabric OneLake?

When I mount ADLS in Databricks, I use the following code:

container_name = "root"
storage_account = "xxxxxxxxx"
key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxQ=="

url = "wasbs://" + container_name + "@" + storage_account + ".blob.core.windows.net/"
config = "fs.azure.account.key." + storage_account + ".blob.core.windows.net"

mount_folder = "/mnt/path"
mounted_list = dbutils.fs.mounts()

mounted_exist = False
for item in mounted_list:
  if mount_folder in item[0]:
    mounted_exist = True
    break

if not mounted_exist:
  dbutils.fs.mount(source = url, mount_point = mount_folder, extra_configs = {config : key})

I have tried a similar approach to mount Azure Fabric OneLake as follows:

url = "abfss://my_workspace@onelake.dfs.fabric.microsoft.com/my_lakehouse.Lakehouse"

mount_folder = "/mnt/path"
mounted_list = dbutils.fs.mounts()

mounted_exist = False
for item in mounted_list:
  if mount_folder in item[0]:
    mounted_exist = True
    break

if not mounted_exist:
  dbutils.fs.mount(source = url, mount_point = mount_folder)

However, the above fails because it is still trying to mount ADLS Gen2 storage when it should be mounting OneLake storage.

Any thoughts?


Solution

  • You can connect to OneLake from Azure Databricks using credential passthrough, which enables seamless authentication using your Azure Databricks login identity. You will be able to read and write to both the Files and Tables sections without needing separate credentials.

    For more details, refer to the guide on integrating OneLake with Azure Databricks.

    You can follow the steps below:

    Create the cluster with the "Enable credential passthrough for user-level data access" option enabled under Advanced options.

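    If you want to confirm from a notebook that the attached cluster really has passthrough enabled, a minimal check like the one below can help (this assumes the setting is surfaced through the spark.databricks.passthrough.enabled Spark configuration key, which may vary by runtime version):

    # Optional sanity check: print whether credential passthrough is enabled.
    # Assumption: the cluster exposes this Spark conf key; adjust if your runtime differs.
    print(spark.conf.get("spark.databricks.passthrough.enabled", "false"))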

    Next, you’ll need to use the ABFS path instead of the HTTPS path.
    To find the ABFS path, follow these steps:

    1. Navigate to your workspace.
    2. Locate the file or folder you want to access.
    3. Click the "..." (more options) menu next to it.
    4. Select "Copy ABFS path" from the menu.

    This copied path can now be used directly in your Databricks code for reading or writing data.

    From the copied ABFS path, note the workspace and lakehouse identifiers:

    workspace_id = "<workspace_id>"
    lakehouse_id = "<lakehouse_id>"
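
    If you prefer to derive these values from the copied ABFS path instead of filling them in by hand, a small helper along the lines below can work (an illustrative sketch only; split_onelake_path is not part of any Databricks or Fabric API):

    from urllib.parse import urlparse

    def split_onelake_path(abfss_path):
        # Hypothetical helper: splits an abfss:// OneLake path into the
        # workspace part (before '@') and the lakehouse part (first path segment).
        parsed = urlparse(abfss_path)
        workspace = parsed.netloc.split("@")[0]
        lakehouse = parsed.path.lstrip("/").split("/")[0]
        return workspace, lakehouse

    workspace_id, lakehouse_id = split_onelake_path(
        "abfss://<workspace_id>@onelake.dfs.fabric.microsoft.com/<lakehouse_id>/Files/data"
    )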
    

    Reading with credential passthrough:

    df = spark.read.format("parquet").load(f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{lakehouse_id}/Files/data")
    df.show(10)
    

    Writing to OneLake:

    df.write.format("delta").mode("overwrite").save(f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{lakehouse_id}/Tables/dbx_delta_credspass")
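
    To confirm the write, you can read the same table back using the same path as the write above:

    check_df = spark.read.format("delta").load(f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{lakehouse_id}/Tables/dbx_delta_credspass")
    check_df.show(5)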
    

    Note: You can reference OneLake paths using either unique identifiers (GUIDs) or readable names. When using names, make sure that both the workspace and lakehouse names are free of special characters and spaces.

    Examples:
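
    Both forms below point at the same Files folder; the workspace/lakehouse names and the GUID placeholders are illustrative, so substitute your own values:

    # Readable names: workspace name plus "<lakehouse name>.Lakehouse"
    names_path = "abfss://my_workspace@onelake.dfs.fabric.microsoft.com/my_lakehouse.Lakehouse/Files/data"

    # GUIDs: workspace ID plus lakehouse ID
    guid_path = "abfss://<workspace_id>@onelake.dfs.fabric.microsoft.com/<lakehouse_id>/Files/data"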