Tags: regex, powershell, regex-lookarounds, select-string

Select-String regex


I'm searching a large number of logs in a foreach loop for a string ($text), and currently writing each matching line in full to an output file ($logfile):

Get-ChildItem "\\$server\$Path" -Filter "*.log" | Select-String -Pattern $text | Select-Object -ExpandProperty Line | Out-File $logfile -Append

A sample line from one of the log files might look like this:

May 25 04:08:36.640 2016 AUDITOF GUID 1312.2657.11075.54819.13021094807.198 opened by USER

where $text = "opened by USER"

All of this works fine: it writes out every line of every log file that contains $text, which is great.

But what I'd really like is output containing only the date/time and the GUID. The GUID can vary in format and length, but it will always contain dots, and it will always follow "GUID" (space) and precede (space) "opened".

In short, I'm looking for a regex, using a lookbehind, a lookahead, or a plain match, that would write something like this to $logfile:

May 25 04:08:36.640 2016,1312.2657.11075.54819.13021094807.198

Any help appreciated. I'm lousy with Regex.


Solution

  • One way to do this:

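    # Find the lines containing the literal text, then reduce each one to "timestamp,GUID"
    $result = Get-ChildItem "\\$server\$Path" -Filter "*.log" -File |
              Select-String -Pattern $text -SimpleMatch |   # -SimpleMatch: treat $text as plain text, not as a regex
              Select-Object -ExpandProperty Line |
              ForEach-Object {
                  # Group 1 captures the timestamp, group 2 the GUID that follows "GUID "
                  if ($_ -match '([a-z]{3,}\s*\d{2}\s*\d{2}:\d{2}:\d{2}\.\d{3}\s*\d{4}).*GUID ([\d.]+)') {
                      '{0},{1}' -f $matches[1], $matches[2]
                  }
              }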
    
    $result | Out-File $logfile -Append 
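
    To try the pattern without touching the log share, it can be run against the sample line from the question. This is only a quick check, assuming real log lines have the same layout as that sample; `$line`, `$pattern` and `$csv` are illustrative names.

    ```powershell
    # Sample line copied from the question
    $line = 'May 25 04:08:36.640 2016 AUDITOF GUID 1312.2657.11075.54819.13021094807.198 opened by USER'

    # Same pattern as above; -match is case-insensitive by default,
    # so [a-z] also matches the capital "M" in "May"
    $pattern = '([a-z]{3,}\s*\d{2}\s*\d{2}:\d{2}:\d{2}\.\d{3}\s*\d{4}).*GUID ([\d.]+)'

    # $matches[1] is the timestamp, $matches[2] is the GUID
    $csv = if ($line -match $pattern) { '{0},{1}' -f $matches[1], $matches[2] }
    $csv   # May 25 04:08:36.640 2016,1312.2657.11075.54819.13021094807.198
    ```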
    

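    Explanation:

    Regex details (PowerShell's -match operator is case-insensitive by default, so [a-z] also matches the capital “M” in “May”):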

    (               Match the regular expression below and capture its match into backreference number 1
       [a-z]        Match a single character in the range between “a” and “z”
          {3,}      Between 3 and unlimited times, as many times as possible, giving back as needed (greedy)
       \s           Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
          *         Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
       \d           Match a single digit 0..9
          {2}       Exactly 2 times
       \s           Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
          *         Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
       \d           Match a single digit 0..9
          {2}       Exactly 2 times
       :            Match the character “:” literally
       \d           Match a single digit 0..9
          {2}       Exactly 2 times
       :            Match the character “:” literally
       \d           Match a single digit 0..9
          {2}       Exactly 2 times
       \.           Match the character “.” literally
       \d           Match a single digit 0..9
          {3}       Exactly 3 times
       \s           Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
          *         Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
       \d           Match a single digit 0..9
          {4}       Exactly 4 times
    )
    .               Match any single character that is not a line break character
       *            Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    GUID            Match the characters “GUID ” (including the trailing space) literally
    (               Match the regular expression below and capture its match into backreference number 2
       [\d.]        Match a single character present in the list below
                    A single digit 0..9
                    The character “.”
          +         Between one and unlimited times, as many times as possible, giving back as needed (greedy)
    )
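
    Since the question mentions lookarounds and says the GUID format can vary, here is a hedged alternative sketch rather than a replacement for the pattern above. It assumes the GUID never contains whitespace, that it always sits between "GUID " and " opened", and that the timestamp is the first four space-separated fields of the line. `$guid` and `$stamp` are illustrative names.

    ```powershell
    # Sample line copied from the question
    $line = 'May 25 04:08:36.640 2016 AUDITOF GUID 1312.2657.11075.54819.13021094807.198 opened by USER'

    # (?<=GUID )   lookbehind: the match must be preceded by "GUID "
    # \S+          one or more non-whitespace characters (digits, dots, letters, ...)
    # (?= opened)  lookahead: the match must be followed by " opened"
    $guid = [regex]::Match($line, '(?<=GUID )\S+(?= opened)').Value

    # Timestamp: month name, 1-2 digit day, time with fraction, 4-digit year
    $stamp = [regex]::Match($line, '^\S+\s+\d{1,2}\s+[\d:.]+\s+\d{4}').Value

    '{0},{1}' -f $stamp, $guid   # May 25 04:08:36.640 2016,1312.2657.11075.54819.13021094807.198
    ```

    The lookarounds keep "GUID " and " opened" out of the match itself, so no capture groups are needed, and `\S+` also accepts GUIDs that contain characters other than digits and dots.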