PowerShell: Remove Special Character(s) from Filenames


I am looking for a way to remove several special characters from filenames via a PowerShell script.

My filenames look like this:

[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt

I have been puzzling over how to remove the [] and everything between them, remove the () and everything between them, and replace every _ with a space, with the desired result being a filename that looks like this:

first day of month 01.txt

Thus far, I have tried the solution below, to no avail. I ran each of these from the directory in which the files reside.

Get-ChildItem -Path .\ -Filter *.mkv | %{
$Name = $_.Name
$NewName = $Name -Replace "(\s*)\(.*\)",''
$NewName2 = $NewName -Replace "[\s*]\[.*\]",''
$NewName3 = $NewName2 -Replace "_",' '
Rename-Item -Path $_ -NewName $NewName3
}

It does not work even if I try one replacement at a time, like this:

Get-ChildItem -Path .\ -Filter *.mkv | %{
$Name = $_.Name
$NewName = $Name -Replace "(\s*)\(.*\)",''
Rename-Item -Path $_ -NewName $NewName
}

I assume there is an inherent flaw in the way I am trying to accomplish this task. That being said, I would prefer to use the Rename-Item cmdlet rather than a Move-Item solution.


Solution

  • gci *.txt | Rename-Item -NewName {$_.Name -replace '_*(\[.*?\]|\(.*?\))_*' -replace '_+', ' '}
    

    The -NewName script block uses a regex that matches [text] or (text) blocks and replaces them with nothing. Brackets and parentheses must be escaped in a regex to match them literally, and the lazy .*? stops at the first closing bracket or parenthesis rather than the last one. The pattern also consumes any underscores directly before or after each block, so that [Report]_ and _[repnbr1] are removed whole; otherwise a stray _ would be left at the start or end of the name and become a leading or trailing space, which is annoying. The second -replace then turns each remaining run of underscores into a single space.

    See the regex working here: Regex101
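
To check the pattern before touching any files, it can be run against the sample filename from the question as a plain string. The sketch below does that and then previews the rename. The -WhatIf switch and the -File filter are additions for illustration and are not part of the answer above, and $name and $newName are hypothetical variable names.

```powershell
# Sample filename taken from the question.
$name = '[Report]_first_day_of_month_01_(generated_by_powershell)_[repnbr1].txt'

# Step 1: drop each [..] or (..) block together with any adjacent underscores.
# Step 2: turn each remaining run of underscores into a single space.
$newName = $name -replace '_*(\[.*?\]|\(.*?\))_*' -replace '_+', ' '
$newName   # first day of month 01.txt

# Dry run (assumption: run from the directory that holds the files).
# -WhatIf only reports what would be renamed; remove it to rename for real.
Get-ChildItem -Filter *.txt -File |
    Rename-Item -NewName { $_.Name -replace '_*(\[.*?\]|\(.*?\))_*' -replace '_+', ' ' } -WhatIf
```

Because the files arrive through the pipeline, Rename-Item should receive each one as a literal path, so the square brackets in the original names are not treated as wildcard characters.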