The problem
After setting up Unity Catalog and a managed Volume, I can upload files to the volume and download files from it in the Databricks Workspace UI.
However, I cannot access the volume from a notebook. I created an All-purpose compute and ran dbutils.fs.ls("/Volumes/catalog1/schema1/volumn11"). Then I got this error:
Operation failed: "This request is not authorized to perform this operation.", 403, GET
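For reference, the minimal repro in a notebook cell, with the exact error as a comment:

# Run in a notebook attached to the All-purpose compute (Unity Catalog enabled)
dbutils.fs.ls("/Volumes/catalog1/schema1/volumn11")
# Fails with:
# Operation failed: "This request is not authorized to perform this operation.", 403, GET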
How we set up Unity Catalog and Managed Volume
Here is what I did, step by step:

1. Created a Metastore metastore1.
2. Created an ADLS Gen2 storage account adsl_gen2_1.
3. Created an Access Connector for Azure Databricks access_connector_for_dbr_1.
4. On adsl_gen2_1, I assigned the roles Storage Blob Data Contributor and Storage Queue Data Contributor to access_connector_for_dbr_1.
5. In adsl_gen2_1, created two containers: adsl_gen2_1_container_catalog_default and adsl_gen2_1_container_schema1.
6. Created a Storage Credential dbr_strg_cred_1, whose connector id is the resource id of access_connector_for_dbr_1.
7. Created two External Locations with the Storage Credential dbr_strg_cred_1:
   - dbr_ext_loc_catalog_default, which points to the ADLS Gen2 container adsl_gen2_1_container_catalog_default; the Permissions of this External Location were not set (empty).
   - dbr_ext_loc_schema1, which points to the ADLS Gen2 container adsl_gen2_1_container_schema1; the Permissions of this External Location were not set (empty).
8. Created a catalog catalog1 under metastore1, and set dbr_ext_loc_catalog_default as this catalog's Storage Location.
9. Created a schema schema1 under catalog1, and set dbr_ext_loc_schema1 as this schema's Storage Location.
10. Created a managed volume volumn11 under schema1.

I uploaded a file 123.csv to volumn11 through the Workspace UI. From a notebook on the All-purpose compute I then ran each of the following, and every one fails with the 403 error (a SQL sketch of the Unity Catalog objects above follows these commands):
dbutils.fs.ls("/Volumes/catalog1/schema1/volumn11")
dbutils.fs.ls("dbfs:/Volumes/catalog1/schema1/volumn11")
spark.read.format("csv").option("header","True").load("/Volumes/catalog1/schema1/volumn11/123.csv")
spark.read.format("csv").option("header","True").load("dbfs:/Volumes/catalog1/schema1/volumn11/123.csv")
Details about the All-purpose compute
I found the reason and a solution, but I feel this is a bug, and I wonder what the best practice is.
When I set the ADLS Gen2 account's Public network access to "Enabled from all networks", as shown below, I can access the volume from a notebook.
However, if I set it to "Enabled from selected virtual networks and IP addresses", as shown below, I cannot access the volume from a notebook, even though I added the VM's public IP to the firewall whitelist, added the resource type Microsoft.Databricks/accessConnectors under resource instances, and enabled the exception "Allow Azure services on the trusted services list to access this storage account". As I understand it, since my compute has the Unity Catalog badge, it should reach the ADLS Gen2 account via the Access Connector for Databricks (a managed identity), so it should still be able to access the storage account.
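One check that I think helps to tell a storage-firewall block apart from a Unity Catalog permissions problem is to list the external location's path directly from the same notebook; this is just a sketch, with the abfss:// URL being a placeholder based on the names above:

# If this also fails with 403 while the storage firewall is restricted, the block is
# at the storage-account network level, not in Unity Catalog grants.
dbutils.fs.ls("abfss://adsl_gen2_1_container_schema1@adsl_gen2_1.dfs.core.windows.net/")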