
How to fill an array efficiently in PowerShell


I want to fill a dynamic array with the same integer value as fast as possible in PowerShell.
Measure-Command shows that filling it takes 7 seconds on my system.
My current code (abridged) looks like this:

$myArray = @()
$length = 16385
for ($i=1;$i -le $length; $i++) {$myArray += 2}  

(Full code can be seen on gist.github.com or on superuser)

Note that $length can vary; I chose a fixed value here only to keep the example simple.

Q: How do I speed up this PowerShell code?


Solution

  • You can repeat arrays, just as you can repeat strings:

    $myArray = ,2 * $length
    

    This means "take the array with the single element 2 and repeat it $length times, yielding a new array".
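
    A minimal sketch of the repetition operator at work, using a small length so the output stays readable. The value 5 is an assumption made for illustration only:

    ```powershell
    $length = 5                      # small illustrative value

    # The unary comma builds a one-element array; * repeats its elements.
    $myArray = ,2 * $length

    $myArray.Count                   # 5
    $myArray -join ' '               # 2 2 2 2 2
    $myArray.GetType().Name          # Object[]

    # Strings repeat the same way, for comparison.
    'ab' * 3                         # ababab
    ```

    The result is an Object[] rather than an int[]. If a typed array is needed, casting the result with [int[]] should work.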

    Note that you cannot really use this to create multidimensional arrays, because the following:

    $some2darray = ,(,2 * 1000) * 1000
    

    will just create 1000 references to the same inner array, so a change to one row shows up in every row. In that case you can use a hybrid strategy. I have used

    $some2darray = 1..1000 | ForEach-Object { ,(,2 * 1000) }
    

    in the past, but the performance measurements below suggest that

    $some2darray = foreach ($i in 1..1000) { ,(,2 * 1000) }
    

    would be a much faster way.
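
    The difference between the two approaches can be seen with a small sketch. The size of 3 and the variable names $shared and $independent are chosen here for illustration only:

    ```powershell
    $size = 3   # small illustrative size

    # Repeating the outer array copies references, so every row is the same inner array.
    $shared = ,(,2 * $size) * $size
    $shared[0][0] = 99
    $shared[1][0]                                        # 99: the change shows up in row 1 too
    [object]::ReferenceEquals($shared[0], $shared[1])    # True

    # Building the inner array inside the loop body gives each row its own array.
    $independent = foreach ($i in 1..$size) { ,(,2 * $size) }
    $independent[0][0] = 99
    $independent[1][0]                                   # 2: row 1 is unaffected
    ```

    The leading comma inside the loop body wraps each inner array so that it is emitted as a single item instead of being unrolled into its elements.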


    Some performance measurements:

    Command                                                  Average Time (ms)
    -------                                                  -----------------
    $a = ,2 * $length                                                 0.135902 # my own
    [int[]]$a = [System.Linq.Enumerable]::Repeat(2, $length)           7.15362 # JPBlanc
    $a = foreach ($i in 1..$length) { 2 }                             14.54417
    [int[]]$a = -split "2 " * $length                                24.867394
    $a = for ($i = 0; $i -lt $length; $i++) { 2 }                    45.771122 # Ansgar
    $a = 1..$length | %{ 2 }                                         431.70304 # JPBlanc
    $a = @(); for ($i = 0; $i -lt $length; $i++) { $a += 2 }       10425.79214 # original code
    

    These were obtained by running each variant 50 times through Measure-Command, each with the same value for $length, and averaging the results.

    The third and fifth entries are a bit of a surprise, actually. Apparently it's much faster to foreach over a range than to use a normal for loop.


    Code to generate the table above:

    $length = 16384
    
    $tests = '$a = ,2 * $length',
             '[int[]]$a = [System.Linq.Enumerable]::Repeat(2, $length)',
             '$a = for ($i = 0; $i -lt $length; $i++) { 2 }',
             '$a = foreach ($i in 1..$length) { 2 }',
             '$a = 1..$length | %{ 2 }',
             '$a = @(); for ($i = 0; $i -lt $length; $i++) { $a += 2 }',
             '[int[]]$a = -split "2 " * $length'
    
    $tests | ForEach-Object {
        $cmd = $_
        # Time each variant 50 times, starting from a clean state on every run.
        $timings = 1..50 | ForEach-Object {
            Remove-Variable i,a -ErrorAction Ignore
            [GC]::Collect()
            Measure-Command { Invoke-Expression $cmd }
        }
        # Emit one result row per variant with its mean run time.
        [pscustomobject]@{
            Command = $cmd
            'Average Time (ms)' = ($timings | Measure-Object -Average TotalMilliseconds).Average
        }
    } | Sort-Object Ave* | Format-Table -AutoSize -Wrap
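
    A timing comparison like this is only meaningful if the variants produce the same result. The following is a minimal sketch of such a check for three of the variants; the variable names $viaRepeat, $viaLinq and $viaForeach are made up for this example and are not part of the benchmark:

    ```powershell
    $length = 16384

    # Three of the benchmarked variants, each stored under its own name.
    $viaRepeat  = ,2 * $length
    $viaLinq    = [int[]][System.Linq.Enumerable]::Repeat(2, $length)
    $viaForeach = foreach ($i in 1..$length) { 2 }

    # Report the element count and how many elements differ from 2.
    foreach ($candidate in $viaRepeat, $viaLinq, $viaForeach) {
        $wrong = @($candidate | Where-Object { $_ -ne 2 })
        '{0} elements, {1} not equal to 2' -f $candidate.Count, $wrong.Count
    }
    ```

    Each line of output should report $length elements and no mismatches. The element types can still differ between variants (Object[] versus int[]), which this check does not look at.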