javascript azure azure-blob-storage

Azure Datalake Storage list first level of directories in container


I have a container named foo and a couple of directories in it with the following hierarchy:

foo\dir1
foo\dir2
...

How can I retrieve only the dir1 and dir2 directories? I'm currently using the Azure.Storage.Blobs (12.9.1) library.

What I've tried:

      var blobContainerClient = blobServiceClient.GetBlobContainerClient("foo");

      // GetBlobs() is a flat listing: it walks the whole container, page by page.
      var resultSegment = blobContainerClient.GetBlobs().AsPages();
      IList<string> blobs = new List<string>();
      foreach (Azure.Page<BlobItem> blobPage in resultSegment)
      {
        foreach (BlobItem blobItem in blobPage.Values)
        {
          blobs.Add(blobItem.Name);
        }
      }

      return blobs;

This recursively returns every file in the foo container. Note that the storage account has a hierarchical namespace enabled. I've also tried this solution, but it doesn't work, I think because each directory is itself treated as a blob.


Solution

  • I found a solution, but I don't know whether it is the best one. If a better solution appears, I will delete this answer.

    If I understand correctly, in a storage account with a hierarchical namespace (Data Lake) even a directory is treated as a blob. I observed that directories have a ContentLength of 0 and a ContentHash that is an empty byte array (byte[0]). Based on those observations, I ended up with the following:

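          var blobContainerClient = blobServiceClient.GetBlobContainerClient("foo");
    
          // Heuristic: directories show up as zero-length blobs with an empty content hash.
          // GetBlobs() is a flat listing, so nested directories match as well.
          // The ?. guard avoids a NullReferenceException when a blob has no content hash.
          return blobContainerClient.GetBlobs()
            .Where(b => b.Properties.ContentLength == 0 && b.Properties.ContentHash?.Length == 0)
            .Select(b => b.Name)
            .ToList();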
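
    If the Azure.Storage.Files.DataLake package is an option, a non-recursive path listing may be a more direct route, since each PathItem carries an IsDirectory flag. The following is a minimal sketch under that assumption, not a tested drop-in: the DataLakeDirectoryLister class and the ListTopLevelDirectories method are hypothetical names, and the DataLakeServiceClient is assumed to be already constructed for the same storage account.

    ```csharp
    using System.Collections.Generic;
    using System.Linq;
    using Azure.Storage.Files.DataLake;
    using Azure.Storage.Files.DataLake.Models;

    public static class DataLakeDirectoryLister
    {
        // Hypothetical helper: returns the names of the directories that sit
        // directly under the root of the given file system (container).
        public static List<string> ListTopLevelDirectories(
            DataLakeServiceClient serviceClient, string fileSystemName)
        {
            DataLakeFileSystemClient fileSystem =
                serviceClient.GetFileSystemClient(fileSystemName);

            // recursive: false lists only the immediate children of the root path.
            return fileSystem.GetPaths(recursive: false)
                .Where(p => p.IsDirectory == true) // IsDirectory is a nullable bool
                .Select(p => p.Name)
                .ToList();
        }
    }
    ```

    With the hierarchy from the question, calling ListTopLevelDirectories(serviceClient, "foo") should yield dir1 and dir2, assuming the account exposes the Data Lake (dfs) endpoint.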