I'm trying to list the containers inside a specific directory within an Azure Data Lake Storage account, but there doesn't seem to be any function that can handle this.
Here is my hierarchy:
assets
root
container1
container2
container3
container4
container5
I wrote the following code, which takes a path, but it lists all of the containers, including those nested inside containerX. What I want is to list only the names of the containers directly under assets/root, without any containers deeper than containerX.
import os
from azure.storage.filedatalake import DataLakeServiceClient
connection_string = os.getenv("AZURE_STORAGE_CONNECTION_STRING")
data_lake_service_client = DataLakeServiceClient.from_connection_string(conn_str=connection_string)
filesystem_client = data_lake_service_client.get_file_system_client(file_system="assets")
paths = filesystem_client.get_paths(path="root")
for path in paths:
    if path.is_directory:
        print("\t" + path.name)
It's quite strange that there is no function like get_containers(path="") or list_containers(path="") to just list them.
Please change the following line of code:
paths = filesystem_client.get_paths(path="root")
to
paths = filesystem_client.get_paths(path="root", recursive=False)
Based on the documentation for get_paths, the default value of the recursive parameter is True, and that's why you are seeing subfolders as well.
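For reference, here is a minimal sketch of the whole thing with that change applied. The helper names list_immediate_directories and main are hypothetical (they are not part of the SDK), and the sketch assumes that path.name holds the path relative to the file system root (for example "root/container1"), so only the last segment is kept.

```python
import os


def list_immediate_directories(filesystem_client, parent):
    """Return the names of directories directly under `parent`.

    Hypothetical helper: `filesystem_client` is expected to behave like
    the client returned by get_file_system_client().
    """
    # recursive=False limits the listing to the first level below `parent`.
    paths = filesystem_client.get_paths(path=parent, recursive=False)
    # Assumption: path.name is relative to the file system root,
    # e.g. "root/container1", so keep only the last segment.
    return [p.name.rsplit("/", 1)[-1] for p in paths if p.is_directory]


def main():
    # Imported here so the helper above can be used without the SDK installed.
    from azure.storage.filedatalake import DataLakeServiceClient

    connection_string = os.getenv("AZURE_STORAGE_CONNECTION_STRING")
    service_client = DataLakeServiceClient.from_connection_string(
        conn_str=connection_string
    )
    filesystem_client = service_client.get_file_system_client(file_system="assets")
    for name in list_immediate_directories(filesystem_client, "root"):
        print("\t" + name)
```

Calling main() with AZURE_STORAGE_CONNECTION_STRING set should print only the directories directly under assets/root. If you want the full paths instead of just the last segment, return p.name unchanged.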