multithreading powershell parallel-processing

PowerShell script error when using parallel processing


I'm using PowerShell's ForEach-Object -Parallel to speed up a script that checks file formats with ImageMagick. The script works without parallel processing, but it is too slow for a large number of files. When I add -Parallel, I keep getting the error below. Here's the code. Can someone help me figure out what's going wrong?

Error:

Parameter set cannot be resolved using the specified named parameters. One or more parameters are either missing, not allowed together, or an insufficient number of parameters have been provided for the selected parameter set.

Code:

$files | ForEach-Object -Parallel {
    param ($file, $magickPath, $errorLogFile)

    try {
        # Run ImageMagick to identify the format
        $output = & $magickPath identify -ping -quiet -format "%m" $file.FullName

        # Check if the image is NOT JPEG or PNG
        if ($output -ne "JPEG" -and $output -ne "PNG") {
            $fileName = $file.Name
            "$fileName - $output"
        }
    }
    catch {
        # Log errors
        $errorMsg = "Error processing file: $($file.FullName). Error: $($_.Exception.Message)"
        $errorMsg | Out-File -FilePath $errorLogFile -Append
    }
} -ThrottleLimit 8 -ArgumentList $magickPath, $errorLogFile | ForEach-Object {
    if ($_ -ne $null) {
        $nonJpegPngFiles += $_
    }
}

Update:

# Folder where the images are located
$folderPath = "C:\Users\johndoe\Documents\saved images"

# File where unsupported image names will be saved
$outputFile = "C:\Users\johndoe\Documents\Unsupported list\unsupported_images_list.txt"

# Path to ImageMagick executable
$magickPath = "C:\Program Files\ImageMagick-7.1.1-016\magick.exe"

# Get all files in the folder
$files = Get-ChildItem -Path $folderPath -File

# Process files in parallel
$result = $files | ForEach-Object -Parallel {
    # Run ImageMagick to identify the format, redirecting stderr to stdout
    $output = & $using:magickPath identify -ping -quiet -format '%m' $_.FullName 2>&1

    # Return the file path and output for filtering later
    [pscustomobject]@{
        Path   = $_.FullName
        Output = $output
    }
} -ThrottleLimit 8

# Filter files that are not JPEG or PNG
$unsupportedFiles = $result | Where-Object { $_.Output -notin 'JPEG', 'PNG' }

# If unsupported files exist, write them to the output file
if ($unsupportedFiles.Count -gt 0) {
    $unsupportedFiles | Select-Object -ExpandProperty Path | Out-File -FilePath $outputFile
    Write-Host "List of unsupported files has been saved to $outputFile."
} else {
    Write-Host "No unsupported files found."
}

Solution

  • There are a few issues with your code. The error happens because the -Parallel parameter set of ForEach-Object has no -ArgumentList parameter. In addition, a param(...) block has no effect inside the parallel script block; to use a variable defined outside its scope, use the $using: scope modifier. Aside from that, try / catch won't handle errors from your binary, because by default a native executable reports failures through its stderr output and exit code rather than by throwing a PowerShell exception. Finally, appending to the same file from several parallel threads is not a thread-safe operation, so you shouldn't do that.

    A simple way to handle this is to redirect the error stream of your binary into its output (2>&1) and emit one object per file from the parallel block. Once all files are processed, you can filter for the ones where the output from the binary was not JPEG or PNG:

    $magickPath = 'path\to\ImageMagick'
    
    $result = $files | ForEach-Object -Parallel {
        # Run ImageMagick to identify the format, redirecting stderr to stdout
        $output = & $using:magickPath identify -ping -quiet -format '%m' $_.FullName 2>&1
    
        [pscustomobject]@{
            Path   = $_.FullName
            Output = $output
        }
    }
    
    $result | Where-Object { $_.Output -notin 'JPEG', 'PNG' }
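
To illustrate the point about try / catch, here is a minimal sketch showing that a non-zero exit from a native command does not reach the catch block under default settings, while $LASTEXITCODE still records it. The running PowerShell executable stands in for magick.exe purely for illustration. The names $exe, $caught and $exitCode are hypothetical, and the sketch assumes the default error preferences are in effect.

```powershell
# Assumption: the current PowerShell executable is used only as a stand-in
# for any native command that exits with a non-zero code.
$exe = (Get-Process -Id $PID).Path

$caught = $false
try {
    # The child process exits with code 3 and throws nothing in this session
    & $exe -NoProfile -Command 'exit 3'
}
catch {
    # Not reached with default preferences
    $caught = $true
}

# The exit code of the last native command is still available here
$exitCode = $LASTEXITCODE

"Caught: $caught, exit code: $exitCode"
```

If the exit code is useful for diagnosing failures, it could presumably be emitted from the parallel block as an extra property next to Path and Output, instead of being logged to a shared file.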
    

    Based on the update to your question, which uses the code suggested in this answer: your code is now fine and thread safe. The only suggestion I could make is that, since the parallel loop outputs objects that already contain each file's path and the output from your binary, you could export them as a CSV instead of a plain txt:

    # Using CSV as output is more useful, since each row records:
    #   1. The path of the unsupported file
    #   2. The output from the binary
    $outputFile = 'C:\Users\johndoe\Documents\Unsupported list\unsupported_images_list.csv'
    
    # .... Same code as before above
    
    # Filter files that are not JPEG or PNG
    $unsupportedFiles = $result | Where-Object { $_.Output -notin 'JPEG', 'PNG' }
    
    # If unsupported files exist, write them to the output file
    if ($unsupportedFiles) {
        $unsupportedFiles | Export-Csv $outputFile -NoTypeInformation
        Write-Host "List of unsupported files has been saved to $outputFile."
    }
    else {
        Write-Host 'No unsupported files found.'
    }
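
As a rough illustration of what the CSV would contain, the sketch below runs the same filter over a few made-up result objects and converts them with ConvertTo-Csv, so nothing is written to disk. The sample paths and formats are hypothetical stand-ins for what the parallel block would return.

```powershell
# Hypothetical sample data standing in for the objects emitted by the parallel block
$result = @(
    [pscustomobject]@{ Path = 'C:\images\a.jpg'; Output = 'JPEG' }
    [pscustomobject]@{ Path = 'C:\images\b.png'; Output = 'PNG' }
    [pscustomobject]@{ Path = 'C:\images\c.gif'; Output = 'GIF' }
)

# Same filter as in the answer: keep everything that is not JPEG or PNG
$unsupportedFiles = $result | Where-Object { $_.Output -notin 'JPEG', 'PNG' }

# ConvertTo-Csv returns the lines that Export-Csv would write to the file
$csvLines = $unsupportedFiles | ConvertTo-Csv -NoTypeInformation
$csvLines
# "Path","Output"
# "C:\images\c.gif","GIF"
```

Each row keeps the path and the binary's output together, so an error message captured through 2>&1 would appear in the Output column next to the file that caused it.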
    

    The other suggestion is to remove .Count -gt 0 from the if condition and test the variable itself, as in if ($unsupportedFiles) above. This matters specifically in Windows PowerShell 5.1: if the output is a single object, the condition may evaluate to false because .Count returns null instead of 1, and the script would wrongly report that no unsupported files were found:

    # In PowerShell 5.1
    ([pscustomobject]@{ foo = 'bar' }).Count -gt 0 # false
    
    # In PowerShell 7+ you get the expected result
    ([pscustomobject]@{ foo = 'bar' }).Count -gt 0 # true
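
If an explicit count is still wanted, one common workaround (offered here as a sketch, not as part of the original script) is to wrap the value in the array subexpression operator @(...), which should give a consistent .Count in both Windows PowerShell 5.1 and PowerShell 7+. The variable names below are hypothetical.

```powershell
# A single object, as in the example above
$single = [pscustomobject]@{ foo = 'bar' }

# A pipeline that produces no output at all
$none = @(1, 2, 3) | Where-Object { $_ -gt 10 }

# Wrapping in @() always yields an array, so .Count is reliable
$singleCount = @($single).Count   # 1
$noneCount   = @($none).Count     # 0

"Single: $singleCount, none: $noneCount"
```

Applied to the script, the condition could be written as @($unsupportedFiles).Count -gt 0, although the plain if ($unsupportedFiles) check is shorter and behaves the same here.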