
How to split a string using a regex in PowerShell?


I want to reformat a string such as pcst400801 to have a hyphen before the last 4 digits: pcst40-0801. I've tried both -split and .Split() without success; the two give different results. -split comes closer, but neither gives me output I can work with.

.Split():

> $string = "pcst400801"
> $string.Split("(?=\D{4}\d{2})")
pcst400801

-split:

> $string = "pcst400801"
> $string -split "(\D{4}\d{2})"          

pcst40
0801
> $s1,$s2 = $string -split "(\D{4}\d{2})"
> $s1

> $s2
pcst40
0801

In the -split example, it sets $s1 to null and sets $s2 to both of the substrings at the point I want them split. I then tried to split $s2 using \n as the delimiter but got an error about calling a method on a null-valued expression.

I also tried $s1,$s2 = $string -split "(?=\D{4}\d{2})" which gave the same result as not using ?=. $s1 was still null but $s2 was the entire string.

How do I split the string at the point I want and assign each substring to a variable?


Solution

  • To complement the existing, helpful answers (especially iRon's detailed explanation):

    Since your intent is to insert a character (or substring) into an existing string,[1] -replace, the regular-expression-based string replacement operator, is a natural choice:

    # Insert a "-" before the last 4 digits (\d) at the end ($).
    # In the replacement operand, $& stands for the entire match.
    'pcst400801' -replace '\d{4}$', '-$&'
    
    # Alternative, using your original approach:
    # Insert a "-" after 4 non-digits (\D) followed by 2 digits (\d) at the start (^)
    'pcst400801' -replace '^\D{4}\d{2}', '$&-'
    

    Both expressions return the desired string, 'pcst40-0801'.
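
    If you do want the two parts in separate variables, as the question asks, -split can be given a zero-width assertion so that no characters are consumed at the split point. The following is only a sketch, assuming the input always has the shape shown (4 non-digits followed by 6 digits); the variable names $prefix, $suffix, $p2 and $s2 are illustrative:

    ```powershell
    # (?=\d{4}$) is a lookahead: it matches the empty position just before
    # the last 4 digits, so -split cuts there without removing anything.
    $prefix, $suffix = 'pcst400801' -split '(?=\d{4}$)'
    $prefix                   # pcst40
    $suffix                   # 0801

    # Same idea anchored at the start, closer to the original pattern:
    # the lookbehind matches the empty position right after 4 non-digits + 2 digits.
    $p2, $s2 = 'pcst400801' -split '(?<=^\D{4}\d{2})'

    # Re-join the parts with the hyphen.
    "${prefix}-${suffix}"     # pcst40-0801
    ```

    This also suggests why the attempts in the question behaved as they did: the .Split() method treats its argument as literal separator characters rather than as a regex, while -split with a capture group includes the captured text in the output, and a match at the very start of the input produces a leading empty string. The lookahead (?=\D{4}\d{2}) matches at position 0, so it splits off only that empty string.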


    [1] Loosely speaking; technically, it is a new string that must be created with the desired insertion, given that .NET strings (System.String, [string] in PowerShell terms) are immutable.
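
    A quick way to see this for yourself, as a small sketch ($original and $result are illustrative names): the variable holding the input is unchanged after -replace, and the result is a separate string.

    ```powershell
    $original = 'pcst400801'

    # -replace does not modify $original; it returns a new string.
    $result = $original -replace '\d{4}$', '-$&'

    $original   # pcst400801
    $result     # pcst40-0801
    ```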