I am trying to gather data from several servers using ForEach-Object -Parallel. The variable I use is being populated within the loop, but when the loop finishes the variable is empty.
$DBDetails = "SELECT @@VERSION"
$VMs = ("vm1", "vm2", "vm3", "vm4", "vm5", "vm6", "vm7")
$DBInventory = @()
$scriptBlock = {
    $vm = $_
    $result = Invoke-Sqlcmd -ServerInstance $vm -Query $using:DBDetails
    $DBInventory += $result
    Write-Host "Added $($result.Count) rows from $($vm)"
}
$VMs | ForEach-Object -Parallel $scriptBlock
Write-Host "Number of elements in DBInventory: $($DBInventory.Count)"
I expect the last line to return the number of elements gathered within the loop that executed on the previous line. There should be a total of 7 elements, but I am left with none.
My result looks like this:
Added 1 rows from vm1
Added 1 rows from vm2
Added 1 rows from vm3
Added 1 rows from vm4
Added 1 rows from vm5
Added 1 rows from vm6
Added 1 rows from vm7
Number of elements in DBInventory: 0
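ForEach-Object -Parallel runs the loop body in a separate runspace, meaning you don't have direct access to the variables defined in the calling scope. The $DBInventory += $result statement inside the block therefore creates and updates a variable local to that runspace, and the $DBInventory in your calling scope is never touched.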
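The isolation is easy to see without any SQL Server involved. The following is a minimal sketch (the $counter variable is purely illustrative and not part of the original script) showing that an assignment inside the parallel block does not reach the caller:

```powershell
# Requires PowerShell 7+ (ForEach-Object -Parallel is not available in Windows PowerShell 5.1).
$counter = 0

1..3 | ForEach-Object -Parallel {
    # This assignment creates a new $counter that is local to the runspace;
    # the caller's $counter is a different variable and is left alone.
    $counter = 100 + $_
}

$counter   # still 0
```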
To work around this, make two changes to your code:
1. Use a resizable collection type instead of an array (the example below uses a generic [List[psobject]]).
2. Reference the variable from the calling scope with the using: scope modifier and assign it to a local inside the block.
The resulting local variable will then reference the same list object in memory, and changes made to that list via its methods (Add(), Remove(), AddRange(), etc.) will be reflected anywhere else it's referenced (including the original $DBInventory variable from your calling scope).
$DBDetails = "SELECT @@VERSION"
$VMs = ("vm1", "vm2", "vm3", "vm4", "vm5", "vm6", "vm7")
$DBInventory = [System.Collections.Generic.List[psobject]]::new()
$scriptBlock = {
    $vm = $_
    $inventory = $using:DBInventory
    $result = Invoke-Sqlcmd -ServerInstance $vm -Query $using:DBDetails
    $inventory.AddRange([psobject[]]$result)
    Write-Host "Added $($result.Count) rows from $($vm)"
}
$VMs | ForEach-Object -Parallel $scriptBlock
Write-Host "Number of elements in DBInventory: $($DBInventory.Count)"
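The difference between mutating a shared list and using += on an array can be checked in a single runspace. This is a small sketch with made-up variable names ($alias, $copy) rather than anything from the original script:

```powershell
# A List is a reference type: both variables point at the same object.
$list  = [System.Collections.Generic.List[psobject]]::new()
$alias = $list
$alias.Add('row')       # mutates the shared list in place
$list.Count             # 1

# += on an array builds a new array and rebinds only the variable on the left.
$array = @()
$copy  = $array
$copy += 'row'
$array.Count            # 0
```

This is why the original += approach could not have worked even with $using:, while method calls on a shared list do.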
As mklement0 notes, [List[psobject]] is not thread-safe. For production code you'll definitely want to pick a collection type that is, for example a [System.Collections.Concurrent.ConcurrentBag[psobject]], which is essentially an unordered list:
$DBInventory = [System.Collections.Concurrent.ConcurrentBag[psobject]]::new()
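As a rough illustration of how the script block might look with a bag, here is a self-contained sketch. The strings stand in for query rows and are hypothetical, so that the example runs without Invoke-Sqlcmd:

```powershell
# Requires PowerShell 7+.
$bag = [System.Collections.Concurrent.ConcurrentBag[psobject]]::new()

'vm1', 'vm2', 'vm3' | ForEach-Object -Parallel {
    $inventory = $using:bag
    # Hypothetical stand-in for the rows a real query would return.
    $rows = "$($_)-row1", "$($_)-row2"
    foreach ($row in $rows) {
        $inventory.Add($row)   # ConcurrentBag exposes Add() but no AddRange()
    }
}

$bag.Count   # 6
```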
Unlike List, ConcurrentBag has no AddRange() method, so the script block needs to call Add() once per result row. Beware also that the ConcurrentBag type, as the name might suggest, does not preserve insertion order. If this is a problem, you may want to consider using a [ConcurrentDictionary[string,psobject[]]]; this way you can tie the query output back to the original input string:
$DBInventory = [System.Collections.Concurrent.ConcurrentDictionary[string,psobject[]]]::new()
Since another thread may (hypothetically) have added an entry for the same key since you dispatched your call to Add(), the ConcurrentDictionary type requires us to use it slightly differently than a regular dictionary or hashtable:
$scriptBlock = {
    $vm = $_
    $inventory = $using:DBInventory
    $result = Invoke-Sqlcmd -ServerInstance $vm -Query $using:DBDetails
    $adder = $updater = { return Write-Output $result -NoEnumerate }
    $inventory.AddOrUpdate($vm, $adder, $updater)
    Write-Host "Added $($result.Count) rows from $($vm)"
}
Here, the concurrent dictionary will execute the $adder function on our behalf if the key doesn't already exist (otherwise it'll run the $updater), and the result will be assigned as the entry value.
You can subsequently access the entry values the same way you would a hashtable:
$DBInventory[$vms[-1]] # returns array containing the query results from the last VM in the list
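Because the keys are the input strings, walking the original input list gives the results back in input order regardless of which runspace finished first. The sketch below assumes each key is written exactly once, so it uses TryAdd() instead of AddOrUpdate(). The $names and $results variables and the row strings are hypothetical stand-ins:

```powershell
# Requires PowerShell 7+.
$names   = 'vm1', 'vm2', 'vm3'
$results = [System.Collections.Concurrent.ConcurrentDictionary[string,psobject[]]]::new()

$names | ForEach-Object -Parallel {
    $inventory = $using:results
    # Hypothetical stand-in for query output.
    $rows = [psobject[]]@("$($_)-row1", "$($_)-row2")
    # TryAdd is sufficient here because no two iterations share a key.
    $null = $inventory.TryAdd($_, $rows)
}

# Iterate the input list, not the dictionary, to recover input order.
foreach ($name in $names) {
    '{0}: {1}' -f $name, ($results[$name] -join ', ')
}
```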