Tags: powershell, amazon-web-services, amazon-s3, aws-powershell

PowerShell for AWS: List only "folders" from S3 bucket?


Is there any easy way to use PowerShell to only get a list of "folders" from an S3 bucket, without listing every single object and just scripting a compiled list of distinct paths? There are hundreds of thousands of individual objects in the bucket I'm working in, and that would take a very long time.

It's possible this is a really stupid question and I'm sorry if that's the case, but I couldn't find anything on Google or SO to answer this. I've tried adding wildcards to -KeyPrefix and -Key params of Get-S3Object to no avail. That's the only cmdlet that seems like it might be capable of doing what I'm after.

Pointless backstory: I just want to make sure I'm transferring files to the correct, existing folders. I'm a contracted third party, so I don't have console login access and I'm not the person who maintains the AWS account.

I know this is possible using Java and C# and others, but I'm doing everything else involved with this fairly simple project in PS and was hoping to be able to stick with it.

Thanks in advance.


Solution

  • You can use the AWS Tools for PowerShell to list the objects in the bucket with Get-S3Object and then read the common prefixes from the response object.

    Below is a small library to recursively retrieve subdirectories:

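    function Get-Subdirectories
    {
      param
      (
        [string] $BucketName,
        [string] $KeyPrefix,
        [bool] $Recurse
      )
    
      # List one level below $KeyPrefix. The returned objects are discarded;
      # only the common prefixes recorded in $AWSHistory are of interest.
      Get-S3Object -BucketName $BucketName -KeyPrefix $KeyPrefix -Delimiter '/' | Out-Null
    
      # Hold on to this call's prefixes: each recursive call below
      # replaces $AWSHistory.LastCommand with its own responses.
      $prefixes = $AWSHistory.LastCommand.Responses.Last.CommonPrefixes
    
      if($prefixes.Count -eq 0)
      {
        return
      }
    
      # Emit the prefixes found at this level.
      $prefixes
    
      if($Recurse)
      {
        # Descend into each prefix, using it as the next KeyPrefix.
        $prefixes | % { Get-Subdirectories -BucketName $BucketName -KeyPrefix $_ -Recurse $Recurse }
      }
    }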
    
    function Get-S3Directories
    {
      param
      (
        [string] $BucketName,
        [bool] $Recurse = $false
      )
    
      Get-Subdirectories -BucketName $BucketName -KeyPrefix '/' -Recurse $Recurse
    }
    

    The function recurses by passing each common prefix it finds back in as the next KeyPrefix, so every call checks a single prefix for subdirectories. With the delimiter set to '/', every key that begins with KeyPrefix and contains a further '/' is rolled up, at the first occurrence of the delimiter after the prefix, into one entry in the CommonPrefixes collection of the last response in $AWSHistory.

    To retrieve only the top-level directories in an S3 Bucket:

    PS C:\> Get-S3Directories -BucketName 'myBucket'
    

    To retrieve all directories in an S3 Bucket:

    PS C:\> Get-S3Directories -BucketName 'myBucket' -Recurse $true
    

    This will return a collection of strings, where each string is a common prefix.

    Example Output:

    myprefix/
    myprefix/txt/
    myprefix/img/
    myotherprefix/
    ...
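
If the delimiter behaviour described above is unclear, it may help to reproduce the grouping locally. The sketch below does not call AWS at all; it only imitates, on an in-memory list of keys, how a prefix and a delimiter roll keys into common prefixes. The function name Get-CommonPrefix and the sample keys are hypothetical and are not part of the AWS Tools for PowerShell. It also returns prefixes in the order they are first seen, which is not necessarily the order S3 would return them in.

```powershell
# Hypothetical helper, for illustration only: imitates S3-style common-prefix
# grouping on a local list of keys. It makes no AWS calls.
function Get-CommonPrefix
{
  param
  (
    [string[]] $Keys,
    [string] $KeyPrefix = '',
    [string] $Delimiter = '/'
  )

  # HashSet gives case-sensitive de-duplication (S3 keys are case-sensitive).
  $seen = New-Object 'System.Collections.Generic.HashSet[string]'

  foreach ($key in $Keys)
  {
    # Skip keys that do not begin with the requested prefix.
    if (-not $key.StartsWith($KeyPrefix, [System.StringComparison]::Ordinal)) { continue }

    # Look for the first delimiter that appears after the prefix.
    $index = $key.IndexOf($Delimiter, $KeyPrefix.Length, [System.StringComparison]::Ordinal)

    # No further delimiter: the key is a plain object at this level, not a "folder".
    if ($index -lt 0) { continue }

    # Everything up to and including that delimiter is the common prefix.
    $commonPrefix = $key.Substring(0, $index + $Delimiter.Length)

    # Emit each distinct prefix once.
    if ($seen.Add($commonPrefix)) { $commonPrefix }
  }
}

# Sample keys (made up for this example).
$keys = @(
  'myprefix/txt/a.txt',
  'myprefix/txt/b.txt',
  'myprefix/img/c.png',
  'myotherprefix/d.csv',
  'readme.md'
)

Get-CommonPrefix -Keys $keys                          # top-level "folders"
Get-CommonPrefix -Keys $keys -KeyPrefix 'myprefix/'   # "folders" under myprefix/
```

With these sample keys, the first call yields myprefix/ and myotherprefix/, and the second yields myprefix/txt/ and myprefix/img/. This is the same narrowing that Get-Subdirectories relies on when it passes each common prefix back in as the next KeyPrefix, except that there S3 does the grouping server-side so the individual objects never have to be enumerated by the script.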