powershell  foreach  parallel-processing  pipe  powershell-sdk

How to pass $_ ($PSItem) in a ScriptBlock


I'm basically building my own parallel foreach pipeline function, using runspaces.

My problem is: I call my function like this:

somePipeline | MyNewForeachFunction { scriptBlockHere } | pipelineGoesOn...

How can I pass the $_ parameter correctly into the ScriptBlock? It works when the first line of the ScriptBlock is

param($_)

But as you might have noticed, PowerShell's built-in ForEach-Object and Where-Object do not need such a parameter declaration in every ScriptBlock that is passed to them.

Thanks in advance for your answers, fjf2002

EDIT:

The goal is convenience for the users of MyNewForeachFunction: they shouldn't need to write a param($_) line in their script blocks.

Inside MyNewForeachFunction, the ScriptBlock is currently called via

$PSInstance = [powershell]::Create().AddScript($ScriptBlock).AddParameter('_', $_)
$PSInstance.BeginInvoke()

EDIT2:

The point is: how does, for example, the implementation of the built-in ForEach-Object achieve that $_ needn't be declared as a parameter in its ScriptBlock argument, and can I use that functionality too?

(If the answer is that ForEach-Object is built in and uses some magic I can't use, then that would disqualify PowerShell as a language in my opinion.)

EDIT3:

Thanks to mklement0, I could finally build my general foreach loop. Here's the code:

function ForEachParallel {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory)] [ScriptBlock] $ScriptBlock,
        [Parameter(Mandatory=$false)] [int] $PoolSize = 20,
        [Parameter(ValueFromPipeline)] $PipelineObject
    )

    Begin {
        $RunspacePool = [runspacefactory]::CreateRunspacePool(1, $poolSize)
        $RunspacePool.Open()
        $Runspaces = @()
    }

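    Process {
        # For each pipeline object, queue three commands on a new PowerShell instance:
        # define $_ for the script block, set $ErrorActionPreference to 'Stop',
        # then run the user's script block.
        $PSInstance = [powershell]::Create().
            AddCommand('Set-Variable').AddParameter('Name', '_').AddParameter('Value', $PipelineObject).
            AddCommand('Set-Variable').AddParameter('Name', 'ErrorActionPreference').AddParameter('Value', 'Stop').
            AddScript($ScriptBlock)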

        $PSInstance.RunspacePool = $RunspacePool

        $Runspaces += New-Object PSObject -Property @{
            Instance = $PSInstance
            IAResult = $PSInstance.BeginInvoke()
            Argument = $PipelineObject
        }
    }

    End {
        # Poll until every invocation has finished, emitting results as they complete.
        while($True) {
            $completedRunspaces = @($Runspaces | where {$_.IAResult.IsCompleted})

            $completedRunspaces | foreach {
                Write-Output $_.Instance.EndInvoke($_.IAResult)
                $_.Instance.Dispose()
            }

            if($completedRunspaces.Count -eq $Runspaces.Count) {
                break
            }

            # Keep only the invocations that are still running, then wait before polling again.
            $Runspaces = @($Runspaces | where { $completedRunspaces -notcontains $_ })
            Start-Sleep -Milliseconds 250
        }

        $RunspacePool.Close()
        $RunspacePool.Dispose()
    }
}

Code partly from Mathias R. Jessen's answer to "Why PowerShell workflow is significantly slower than non-workflow script for XML file analysis".


Solution

  • The key is to define $_ as a variable that your script block can see, via a call to Set-Variable.

    Here's a simple example:

    function MyNewForeachFunction {
      [CmdletBinding()]
      param(
        [Parameter(Mandatory)]
        [scriptblock] $ScriptBlock
        ,
        [Parameter(ValueFromPipeline)]
        $InputObject
      )
    
      process {
        $PSInstance = [powershell]::Create()
    
        # Add a call to define $_ based on the current pipeline input object
        $null = $PSInstance.
          AddCommand('Set-Variable').
            AddParameter('Name', '_').
            AddParameter('Value', $InputObject).
          AddScript($ScriptBlock)
    
        $PSInstance.Invoke()
      }
    
    }
    
    # Invoke with sample values.
    1, (Get-Date) | MyNewForeachFunction { "[$_]" }
    

    The above yields something like:

    [1]
    [10/26/2018 00:17:37]
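
One detail of the chained form: consecutive AddCommand()/AddScript() calls add commands to a single pipeline, so Set-Variable and the script block run as stages of that one pipeline. A variant that may read more clearly is to end the Set-Variable statement with AddStatement(), so that the script block runs as a separate statement afterwards. The following is a minimal sketch under that assumption; the helper name Invoke-WithPSItem is hypothetical and not part of the original code.

```powershell
# Hypothetical helper (name invented for illustration): runs a script block
# in a separate PowerShell instance with $_ predefined.
function Invoke-WithPSItem {
  param(
    [Parameter(Mandatory)]
    [scriptblock] $ScriptBlock,
    $InputObject
  )

  $ps = [powershell]::Create()
  try {
    $null = $ps.
      AddCommand('Set-Variable').
        AddParameter('Name', '_').
        AddParameter('Value', $InputObject).
      AddStatement().                      # end the Set-Variable statement here
      AddScript($ScriptBlock.ToString())   # separate statement: the caller's code

    # Invoke() runs both statements in order and returns the collected output.
    $ps.Invoke()
  }
  finally {
    $ps.Dispose()
  }
}

# Sample call; $_ inside the script block refers to the value passed as -InputObject.
Invoke-WithPSItem -InputObject 42 -ScriptBlock { "[$_]" }
```

With the sample call above, the script block should see 42 as $_ and output [42]. The script block is passed as text, so it does not carry over variables or functions from the caller's session; anything it needs has to be defined inside it or passed in the same way as $_.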