powershell, split, compareobject

In PowerShell, how do I split text from Compare-Object input and sort by one of the split values?


Here is sample input from two different files, hash1.txt and hash2.txt:

hash1.txt:

abcdef01234567890 \path\to\file1.txt
01234567890abcdef \path\to\file2.txt
a0b1c2d3e4f567890 \path\to\file3.txt

hash2.txt:

abcdef01234567890 \path\to\file1.txt
11234567890abcdef \path\to\file2.txt
q0b1c2d3e4f567890 \path\to\file3.txt

I can do a simple compare with:

Compare-Object -ReferenceObject (Get-Content "$hash1" | Select -Skip $numskip1) -DifferenceObject (Get-Content "$hash2" | Select -Skip $numskip2 )

The default Compare-Object output is fine, but I would like to sort it by file path, with lines that share a path grouped together and each group separated by a blank line.

In this case, file2.txt and file3.txt have matching file paths but different hash values. How do I produce output like this?

01234567890abcdef \path\to\file2.txt    <=
11234567890abcdef \path\to\file2.txt    =>

a0b1c2d3e4f567890 \path\to\file3.txt    <=
q0b1c2d3e4f567890 \path\to\file3.txt    =>

Sorry, I'm not a programmer, I'm just trying to figure out how to make it easier to sort through a pile of data.


Solution

  • You can use Group-Object to group your results by file path and then use ForEach-Object to format the members of each group separately with Format-Table:

    Compare-Object -ReferenceObject (Get-Content "$hash1" | Select -Skip $numskip1) -DifferenceObject (Get-Content "$hash2" | Select -Skip $numskip2 ) |
      Group-Object { $_.InputObject -replace '^.+? ' } | 
      ForEach-Object { 
        $_.Group | Format-Table -HideTableHeaders | 
          Out-String | ForEach-Object TrimEnd
      }
    

    Note that the output of this command is suitable for display only, due to use of a Format-* cmdlet.

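    The purpose of the script block ({ ... }) acting as a calculated property passed to Group-Object is to group only by the file-path portion of the line: the regex '^.+? ' lazily matches everything from the start of the line through the first space (the hash and its separator), and -replace with no replacement operand removes that match, leaving just the path.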
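
    To see what the grouping key looks like in isolation, here is a minimal sketch that applies the same -replace expression to two lines copied from the question's sample data. The variable names $lines and $paths are only illustrative.

    ```powershell
    # Sample lines in the same "hash path" shape as the question's input.
    $lines = @(
      '01234567890abcdef \path\to\file2.txt'
      'a0b1c2d3e4f567890 \path\to\file3.txt'
    )

    # -replace with no replacement operand deletes the match; '^.+? ' lazily
    # matches from the start of each line through the first space only.
    $paths = $lines -replace '^.+? '

    # Emits the path portion of each line, without the hash.
    $paths
    ```

    Because the match is lazy, it stops at the first space, so a path that itself contains spaces should stay intact.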

    The purpose of the Out-String | ForEach-Object TrimEnd part - courtesy of zett42 - is to reduce the number of empty lines between groups to 1.
    You'll still get an empty initial line, however.
    To eliminate it too, append | Out-String | ForEach-Object Trim to the entire pipeline.
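
    If you need plain strings rather than display-only Format-Table output (for example, to write the report to a file), one possible variation is sketched below. It is not part of the answer above: the arrays $ref and $diff are hypothetical stand-ins for the two Get-Content calls, assuming any header lines have already been skipped, and the four-space gap before the side indicator is just a guess at the desired layout.

    ```powershell
    # Stand-ins for the contents of hash1.txt and hash2.txt (assumption:
    # header lines were already removed with Select -Skip).
    $ref = @(
      'abcdef01234567890 \path\to\file1.txt'
      '01234567890abcdef \path\to\file2.txt'
      'a0b1c2d3e4f567890 \path\to\file3.txt'
    )
    $diff = @(
      'abcdef01234567890 \path\to\file1.txt'
      '11234567890abcdef \path\to\file2.txt'
      'q0b1c2d3e4f567890 \path\to\file3.txt'
    )

    $report = Compare-Object -ReferenceObject $ref -DifferenceObject $diff |
      Group-Object { $_.InputObject -replace '^.+? ' } |
      ForEach-Object {
        # Put the reference-side line ('<=') before the difference-side
        # line ('=>'): $false sorts ahead of $true.
        $_.Group | Sort-Object { $_.SideIndicator -eq '=>' } | ForEach-Object {
          '{0}    {1}' -f $_.InputObject, $_.SideIndicator
        }
        ''   # blank separator line after each group
      }

    $report
    ```

    Each group contributes its lines followed by one empty string, so the result also ends with a blank line. Since $report holds ordinary strings, it can be piped to Set-Content or filtered further.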