
List containers in the Azure Data Lake Storage


I'm trying to list the containers inside a specific directory within an Azure Data Lake Storage account, but there doesn't seem to be any function that can handle this.

Here is my hierarchy:

assets
 root
  container1
  container2
  container3
  container4
  container5

I wrote the following code, which takes the path and prints every container, including those nested inside containerX. What I want is only the names of the containers directly under assets/root, without anything deeper than containerX.

import os
from azure.storage.filedatalake import DataLakeServiceClient

connection_string = os.getenv("AZURE_STORAGE_CONNECTION_STRING")
data_lake_service_client = DataLakeServiceClient.from_connection_string(conn_str=connection_string)

filesystem_client = data_lake_service_client.get_file_system_client(file_system="assets")

paths = filesystem_client.get_paths(path="root")

for path in paths:
    if path.is_directory:
        print("\t" + path.name)

It's quite strange that there are no functions like

get_containers(path="") or list_containers(path="")

to simply list them.


Solution

  • Change the following line of code:

    paths = filesystem_client.get_paths(path="root")

    to

    paths = filesystem_client.get_paths(path="root", recursive=False)

    According to the documentation for get_paths, the recursive parameter defaults to True, which is why you are seeing the subfolders as well.
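
For completeness, here is a minimal sketch of the question's snippet with that change applied. The helper name `list_immediate_directories` and the `main` wrapper are hypothetical, introduced only for illustration; the file system name "assets", the path "root", and the environment variable come from the question. It assumes `path.name` holds the path relative to the file system (for example "root/container1"), so the last segment is taken for display. Note that the "containers" under assets/root are directories in SDK terms, which is why the code filters on `is_directory`.

```python
import os
import posixpath


def list_immediate_directories(filesystem_client, path):
    """Return the names of the directories directly under `path`."""
    # recursive=False limits the listing to the direct children of `path`.
    paths = filesystem_client.get_paths(path=path, recursive=False)
    return [p.name for p in paths if p.is_directory]


def main():
    # Imported here so the helper above does not require the SDK at import time.
    from azure.storage.filedatalake import DataLakeServiceClient

    connection_string = os.getenv("AZURE_STORAGE_CONNECTION_STRING")
    service_client = DataLakeServiceClient.from_connection_string(
        conn_str=connection_string
    )
    filesystem_client = service_client.get_file_system_client(file_system="assets")

    for name in list_immediate_directories(filesystem_client, "root"):
        # Assumption: `name` is relative to the file system, so keep only
        # the final segment when printing.
        print("\t" + posixpath.basename(name))
```

Call `main()` once AZURE_STORAGE_CONNECTION_STRING is set. Keeping the listing logic in a function that accepts the client makes it easy to reuse for other paths.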