regex powershell-core

How to debug why a pwsh7 URL regex does not match


I am working on a PowerShell Core 7 script, running in a Windows environment.

Requirements of the regex:

  1. Three- or four-part host addresses.
  2. The last two parts must be one of two known values: specificPartA.specificPartB or specificPartC.specificPartD.
  3. The first one or two parts cannot contain any digits.

Examples:

Current regex:

"[^0-9]+\.[^0-9]+\.(specificPartA.specificPartB|specificPartC.specificPartD)"

It fails here:

if ("foo.specificPartA.specificPartB" -match "[^0-9]+\.[^0-9]+\.(specificPartA.specificPartB|specificPartC.specificPartD)") { write-output "matches" } else { Write-Output "doesn't match" }

A three-part host should match, but it doesn't.

How can I fix it?

I am running variations of this in a pwsh terminal to test my regex:

if ("foo.specificPartA.specificPartB" -match "[^0-9]+\.[^0-9]+\.(specificPartA.specificPartB|specificPartC.specificPartD)") { write-output "matches" } else { Write-Output "doesn't match" }

Solution

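  • The original pattern always requires two leading parts before the known suffix, so the three-part example cannot match. In addition, [^0-9] also matches a dot, and the unescaped dot inside the suffix matches any character.

    Instead, you could assert the start of the string, match one or more characters other than a digit or a dot, and then match a literal dot.

    Then you can optionally match the same pattern for the second part.

    If you only need to know whether the string matches, you don't need capture groups (a non-capturing group is enough), and you have to escape the dot to match it literally.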

    ^[^0-9.]+\.(?:[^0-9.]+\.)?(?:specificPartA\.specificPartB|specificPartC\.specificPartD)
    

    if ("foo.specificPartA.specificPartB" -match "^[^0-9.]+\.(?:[^0-9.]+\.)?(?:specificPartA\.specificPartB|specificPartC\.specificPartD)") { Write-Output "matches" } else { Write-Output "doesn't match" }
    

    Output

    matches
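
To debug a pattern like this, it can help to run it against a small table of inputs, one per requirement, rather than a single string. The following is a minimal sketch, not a definitive test suite: the sample hosts are hypothetical, and the trailing `$` is an added assumption (it is not part of the pattern above) so that nothing may follow the known suffix.

```powershell
# Anchored pattern from the answer. The trailing $ is an assumption added here
# so that extra text after the known suffix is rejected.
$pattern = '^[^0-9.]+\.(?:[^0-9.]+\.)?(?:specificPartA\.specificPartB|specificPartC\.specificPartD)$'

# Hypothetical sample hosts, one per requirement being checked.
$samples = @(
    'foo.specificPartA.specificPartB'      # three parts, should match
    'foo.bar.specificPartC.specificPartD'  # four parts, should match
    'foo1.specificPartA.specificPartB'     # digit in the first part
    'a.b.c.specificPartA.specificPartB'    # five parts
    'foo.specificPartAxspecificPartB'      # suffix without a literal dot
)

# Collect one row per sample so failures are easy to spot side by side.
$results = foreach ($s in $samples) {
    [pscustomobject]@{
        Host    = $s
        Matches = ($s -match $pattern)
    }
}

$results | Format-Table -AutoSize
```

Only the first two samples should report a match. Single quotes keep PowerShell from trying to expand `$` inside the pattern. Note that `-match` is case-insensitive by default; use `-cmatch` if the suffix must match with exact casing.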