powershell performance recursion filtering get-childitem

Improve search and archive time in a GitHub Actions PowerShell step


Below is my PowerShell workflow step, which searches for the files and folders passed as a comma-separated list in itemsToInclude:

      $zipFileName = "${{ github.workspace }}\package-$env:GITHUB_RUN_NUMBER.zip"
      cd "${{ github.workspace }}"

      $itemsToInclude = $env:INPUTS_FILESINZIP
      Write-Host "itemsToInclude is- $itemsToInclude"

      if (-not (Test-Path $zipFileName)) {
        $null = New-Item $zipFileName -ItemType File
      }

      $workspace = "${{ github.workspace }}"

      # Define the directories to exclude
      $excludeDirectories = @('DevOps')
      $excludeExtensions = @('.java', '.class')

      # Include specific files and folders as per the comma-separated list
      Write-Host 'Include specific files and folders as per the comma-separated list'
      $itemsToIncludeList = $itemsToInclude -split ','
      $filesToInclude = Get-ChildItem $workspace -Recurse -File -Exclude $excludeDirectories | Where-Object {
        $itemName = $_.Name
        Write-Host "Checking file: $itemName"
        $itemsToIncludeList -contains $itemName
      }

      $filesToInclude | ForEach-Object {
        $newZipEntrySplat = @{
          EntryPath   = $_.FullName.Substring($workspace.Length)
          SourcePath  = $_.FullName
          Destination = $zipFileName
        }
        Write-Host "Adding file: $($_.FullName)"
        New-ZipEntry @newZipEntrySplat
      }

      Write-Host "Zip file created: $zipFileName"

    env:
      INPUTS_FILESINZIP: ${{ inputs.filesinzip }}

Searching for the desired files and adding them to the ZIP takes longer than is acceptable.

I therefore want to exclude the DevOps folder and all files with the extensions .java and .class, so that this step takes less time.

Unfortunately, the -Exclude option does not work: all files inside the DevOps folder are still listed in the "Checking file:" output.

Can you please suggest a solution?


Solution

  • What you're looking for is to use -Exclude in a recursive Get-ChildItem call to exclude an entire directory subtree from enumeration.

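    Unfortunately, this is not directly supported in Windows PowerShell, and still isn't as of PowerShell (Core) 7.4: -Exclude is matched against each item's own name only, so the files inside a matching directory, whose names don't match, are still enumerated.

    GitHub issue #15159 is a feature request to also support excluding the entire subtrees of matching subdirectories.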
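
The behavior described in the question can be reproduced on a small throwaway tree. The following is only a sketch: the directory and file names are made up for illustration, and the observation in the comments is what the question reports rather than a documented guarantee.

```powershell
# Throwaway directory tree (hypothetical names, for illustration only).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("exclude-demo-" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path (Join-Path $root 'DevOps'), (Join-Path $root 'src')
Set-Content -LiteralPath (Join-Path $root 'DevOps/pipeline.yml') -Value 'x'
Set-Content -LiteralPath (Join-Path $root 'src/app.config') -Value 'x'

# Same shape of call as in the question. -Exclude is compared with each
# item's own name; neither file is named 'DevOps', so the file located
# inside the DevOps directory is not filtered out.
$found = @(Get-ChildItem $root -Recurse -File -Exclude 'DevOps')
$found.Name

# Clean up the throwaway tree.
Remove-Item -LiteralPath $root -Recurse -Force
```

If pipeline.yml shows up in the output, the excluded directory's subtree was still traversed, which is the cost the workarounds below try to avoid.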


    Workarounds:

    If the subdirectories whose subtrees you want to exclude are all top-level, i.e. immediate child items of the target directory, you can use a two-step approach:

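    # NOTE: -Exclude matches wildcard patterns against item names, so
    #       $excludeExtensions must contain patterns such as '*.java' and
    #       '*.class' rather than the bare extensions '.java' and '.class'.
    $filesToInclude = 
      Get-ChildItem $workspace -Exclude $excludeDirectories |
      Get-ChildItem -Recurse -File -Exclude $excludeExtensions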
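
To try the two-step approach in isolation, here is a minimal sketch on a throwaway tree. The tree layout and file names are hypothetical. As a deliberate deviation, the extension filter uses Where-Object on the Extension property instead of -Exclude, so that the question's '.java' and '.class' values can be used unchanged.

```powershell
$excludeDirectories = @('DevOps')
$excludeExtensions  = @('.java', '.class')

# Throwaway directory tree (hypothetical names, for illustration only).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("twostep-demo-" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path (Join-Path $root 'DevOps'),
                                           (Join-Path $root 'src'),
                                           (Join-Path $root 'docs')
foreach ($relativePath in 'DevOps/pipeline.yml', 'src/Main.java', 'src/app.config', 'docs/guide.md') {
  Set-Content -LiteralPath (Join-Path $root $relativePath) -Value 'x'
}

$files = @(
  # Step 1: enumerate top-level items only, minus the excluded directories.
  Get-ChildItem $root -Exclude $excludeDirectories |
    # Step 2: recurse into whatever is left.
    Get-ChildItem -Recurse -File |
    # Assumption: filter on the Extension property rather than via -Exclude.
    Where-Object { $_.Extension -notin $excludeExtensions }
)
$files.Name | Sort-Object

# Clean up the throwaway tree.
Remove-Item -LiteralPath $root -Recurse -Force
```

Because the DevOps directory is dropped before the recursive call starts, its contents are never visited at all, which is where the time saving comes from.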
    

    If you need to exclude the subtrees of directories matching given names at any level of the input subtree, you will need post-filtering, which results in much slower execution:

    # Construct a regex from the exclusion patterns.
    # NOTE: The individual patterns must be
    #   * either: *literal* names, such as 'DevOps'
    #     * for [ and ] to be used *literally*, escape them as \[ and \]
    #   * or: *regexes* rather than *wildcard* patterns; e.g.:
    #     * instead of 'Foo*', use 'Foo.*?'
    #     * instead of 'Foo?', use 'Foo.'
    $regex = 
      '(?<=^|[\\/])(?:{0})(?=[\\/]|$)' -f ($excludeDirectories -join '|')
    
    $filesToInclude =
      & {
        # Output the target directory itself, alone.
        Get-Item $workspace  
        # Recurse over subdirectories only and exclude matching subtrees.
        Get-ChildItem -Recurse -Directory $workspace |
          Where-Object { $_.FullName -notmatch $regex } 
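      } | # Now enumerate the files located directly in each non-excluded directory.
      # NOTE: No -Recurse here: the directories were already enumerated
      #       recursively above, and recursing again would re-enter the
      #       excluded subtrees and produce duplicates.
      Get-ChildItem -File |
      # Filter on the Extension property, so that values such as '.java' work as-is.
      Where-Object { $_.Extension -notin $excludeExtensions }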
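
Post-filtering still visits every directory beneath an excluded one before discarding it. If that cost matters, one alternative, which is not part of the answer above, is to walk the tree manually and never descend into excluded directories. A minimal sketch, in which Get-PrunedFile and its parameter names are hypothetical and directory names are compared literally (no wildcards or regexes):

```powershell
function Get-PrunedFile {
  param(
    [Parameter(Mandatory)] [string] $Path,
    [string[]] $ExcludeDirectoryName = @(),  # literal directory names, e.g. 'DevOps'
    [string[]] $ExcludeExtension = @()       # extensions with leading dot, e.g. '.java'
  )

  # Explicit stack of directories that still need to be visited.
  $pending = [System.Collections.Generic.Stack[string]]::new()
  $pending.Push((Convert-Path -LiteralPath $Path))

  while ($pending.Count -gt 0) {
    $dir = $pending.Pop()
    foreach ($item in Get-ChildItem -LiteralPath $dir) {
      if ($item.PSIsContainer) {
        # Only queue directories whose name is not excluded;
        # an excluded directory's subtree is therefore never read.
        if ($item.Name -notin $ExcludeDirectoryName) {
          $pending.Push($item.FullName)
        }
      }
      elseif ($item.Extension -notin $ExcludeExtension) {
        $item  # Emit the file object to the pipeline.
      }
    }
  }
}
```

With the question's variables, a call could look like `$filesToInclude = Get-PrunedFile -Path $workspace -ExcludeDirectoryName $excludeDirectories -ExcludeExtension $excludeExtensions`. Whether this is actually faster depends on the shape of the tree, so it is worth comparing the variants with Measure-Command on the real workspace.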