Tags: powershell, pattern-matching, member-enumeration

How to strip out a leading timestamp?


I have some log files.
Some of the UPDATE SQL statements are getting errors, but not all.
I need to know all the statements that are getting errors so I can find the pattern of failure.

I can sort all the log files and get the unique lines, like this:

$In = "C:\temp\data"
$Out1 = "C:\temp\output1"
$Out2 = "C:\temp\output2"

Remove-Item $Out1\*.*
Remove-Item $Out2\*.*

# Get the log files from the last 90 days
Get-ChildItem $In -Filter *.log | Where-Object {$_.LastWriteTime -gt (Get-Date).AddDays(-90)} |
Foreach-Object {
    $content = Get-Content $_.FullName

    #filter and save content to a file
    $content | Where-Object {$_ -match 'STATEMENT'} | Sort-Object -Unique | Set-Content $Out1\$_
}

# merge all the files, sort unique, write to output
Get-Content $Out1\* | Sort-Object -Unique | Set-Content $Out2\output.txt

Works great.

But some of the logs have a date-time stamp in the first 24 characters of each line. I need to strip that out; otherwise all those lines count as unique.

If it helps, each file either has the leading timestamp on every line or on none. The two kinds of lines are not mixed within a single file.

Here is what I have so far:

# Get the log files from the last 90 days
Get-ChildItem $In -Filter *.log | Where-Object {$_.LastWriteTime -gt (Get-Date).AddDays(-90)} |
Foreach-Object {
    $content = Get-Content $_.FullName

    #filter and save content to a file
    $s = $content | Where-Object {$_ -match 'STATEMENT'} 
    # strip datetime from front if exists
    If (Where-Object {$s.Substring(0,1) -Match '/d'}) { $s = $s.Substring(24) }
    $s | Sort-Object -Unique | Set-Content $Out1\$_
}

# merge all the files, sort unique, write to output
Get-Content $Out1\* | Sort-Object -Unique | Set-Content $Out2\output.txt

But it just writes the lines out without stripping the leading characters.


Solution

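Test each line individually inside ForEach-Object, and strip the first 24 characters only from lines that start with a digit:

  • $content | 
      Where-Object { $_ -match 'STATEMENT' } |
      ForEach-Object { if ($_[0] -match '\d') { $_.Substring(24) } else { $_ } } |
      Set-Content $Out1\$_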
    

    Note: Strictly speaking, \d matches everything that the Unicode standard considers a digit, not just the ASCII-range digits 0 to 9; to limit matching to the latter, use [0-9].
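
For illustration, here is a minimal, self-contained sketch of the same per-line approach. The sample lines are made up, and it assumes the timestamp prefix is exactly 24 characters (23 characters plus one space), as described in the question. It also shows two details from the question's attempt and from the note above: the pattern '/d' is a literal slash followed by "d" rather than the digit class '\d', and '\d' accepts non-ASCII digits while '[0-9]' does not. Separately, the Where-Object call in the attempt receives no pipeline input, so it cannot act as a per-line test.

```powershell
# Hypothetical sample data: two lines with a 24-character timestamp prefix
# and one line without it. The real log format may differ.
$lines = @(
    '2024-01-05 10:15:30.123 STATEMENT: UPDATE t SET a = 1'
    '2024-01-06 11:20:45.456 STATEMENT: UPDATE t SET a = 1'
    'STATEMENT: UPDATE t SET b = 2'
)

# '/d' matches the literal text "/d"; '\d' is the digit character class.
$slashD     = '2' -match '/d'    # $false
$backslashD = '2' -match '\d'    # $true

# \d also matches non-ASCII digits such as U+0663 (ARABIC-INDIC DIGIT THREE);
# [0-9] is limited to the ASCII digits.
$arabicThree  = [string][char]0x0663
$unicodeMatch = $arabicThree -match '\d'      # $true
$asciiMatch   = $arabicThree -match '[0-9]'   # $false

# Per-line test: drop the first 24 characters only when the line starts
# with an ASCII digit; pass every other line through unchanged.
$stripped = $lines | ForEach-Object {
    if ($_ -match '^[0-9]') { $_.Substring(24) } else { $_ }
}

# With the timestamps gone, the first two lines collapse into one.
$unique = @($stripped | Sort-Object -Unique)
$unique
```

With these sample lines, the two timestamped entries reduce to the same text, so only two unique lines remain after sorting.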