powershell replace

Odd behavior using -replace with Underscore Character


I've noticed a bit of odd behavior with the -replace operator in PowerShell, and I'm wondering if anyone can tell me what's going on.

I was writing some code to make sure that a new username entered in a textbox contained only numbers, letters (which were previously made lower case), dots, dashes, and underscores. My method was to copy the entered text into another variable, using -replace to strip out everything valid; if anything was left behind, a second -replace would remove those leftover characters from the textbox.

However, when I tried the code as:

$textboxNewUserName.Text -replace "[0-9a-z.-_]", ""

I found that I could enter whatever I wanted, because this statement would always return an empty string.

On the other hand, this code works fine:

$textboxNewUserName.Text -replace "[0-9a-z._-]", ""

Can anyone tell me why the placement of the underscore makes any difference?


Solution

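  • To match a - (HYPHEN-MINUS, U+002D) character verbatim inside [...], a (positive) character group in a regex, you must place it at the very start or the very end of the group (or escape it as \-); otherwise, it is interpreted as a metacharacter that separates the endpoints of a range of characters, as in a-z.

  • In your first attempt, .-_ was therefore interpreted as the range from . (U+002E) to _ (U+005F). That range spans the digits, the uppercase letters, and the punctuation characters / : ; < = > ? @ [ \ ] ^, so all of those were stripped as if they were valid input.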
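The range interpretation can be checked directly by running both patterns against a string of punctuation. This is a minimal sketch; $sample is a made-up test value, not something from the question.

```powershell
# Code points of the two range endpoints in '.-_'
[int][char]'.'   # 46 (U+002E)
[int][char]'_'   # 95 (U+005F)

# Hypothetical sample input: an uppercase letter plus assorted punctuation
$sample = 'A/:@[\]^-!'

# Hyphen in the middle: '.-_' is a range, so everything between
# U+002E and U+005F is removed; the hyphen itself (U+002D) is NOT in the set
$rangeResult = $sample -replace '[0-9a-z.-_]'
$rangeResult     # -!

# Hyphen at the end: '.', '_' and '-' are three literal characters
$literalResult = $sample -replace '[0-9a-z._-]'
$literalResult   # /:@[\]^!
```

In both cases the A is removed because -replace matches case-insensitively by default.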


    Therefore:

    # Removes all characters OTHER than 0-9, a-z (case-insensitively), and . _ -
    $textboxNewUserName.Text -replace '[^0-9a-z._-]'
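As a usage sketch, the same negated character group can either flag invalid input or strip it. Here $name is a hypothetical stand-in for $textboxNewUserName.Text. Note that -replace and -match are case-insensitive by default; -creplace is the case-sensitive variant, which may be preferable if uppercase letters should be rejected as well.

```powershell
# Hypothetical input standing in for $textboxNewUserName.Text
$name = 'John.Doe_99-x!?'

# True if the input contains at least one character outside the allowed set
$hasInvalid = $name -match '[^0-9a-z._-]'
$hasInvalid      # True

# Case-insensitive (default): uppercase letters count as valid and are kept
$cleaned = $name -replace '[^0-9a-z._-]'
$cleaned         # John.Doe_99-x

# Case-sensitive: uppercase letters are treated as invalid and removed
$cleanedStrict = $name -creplace '[^0-9a-z._-]'
$cleanedStrict   # ohn.oe_99-x
```

Omitting the replacement operand, as above, is equivalent to passing an empty string.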