bash punctuation character-class

How to remove punctuation from a string with exceptions using regex in bash


Running echo "Jiro. Inagaki' & Soul, Media_Breeze." | tr -d '[:punct:]' prints the string "Jiro Inagaki Soul MediaBreeze".

However, I want a regular expression that removes all punctuation except the underscore and the ampersand, i.e. I want "Jiro Inagaki & Soul Media_Breeze".

Following advice on character class subtraction from the sources listed at the bottom, I've tried replacing [:punct:] with the following:

... but I haven't gotten anything to work so far. Any help would be much appreciated!

Sources:


Solution

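  • tr works on character sets rather than regular expressions, so character class subtraction is not available. You can instead list the punctuation marks you want removed (any mark not in the list, such as : ; ? or ~, is left in place), e.g.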

    >echo "Jiro. Inagaki' & Soul, Media_Breeze." | tr -d "[.,/\\-\=\+\{\[\]\}\!\@\#\$\%\^\*\'\\\(\)]"
    Jiro Inagaki & Soul Media_Breeze
    

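    Or, alternatively, use -c to delete the complement of the set you want to keep (letters, digits, space, & and _; the \n preserves the trailing newline that echo adds):

    >echo "Jiro. Inagaki' & Soul, Media_Breeze." | tr -dc '[:alnum:] &_\n'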
    Jiro Inagaki & Soul Media_Breeze
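
If a regular-expression form is preferred, the same "keep list" idea can be sketched with a negated bracket expression, either in bash's own pattern substitution or in sed. The function names strip_punct and strip_punct_sed below are made up for illustration. The sketch assumes that everything other than letters, digits, whitespace, & and _ should be removed, which is slightly broader than [:punct:] (non-whitespace control characters are dropped too).

```shell
#!/usr/bin/env bash

# strip_punct is a hypothetical helper name, not from the original post.
# It keeps letters, digits, whitespace, '&' and '_' and drops everything else.
strip_punct() {
    local s=$1
    # ${var//pattern/} deletes every match of a glob pattern.
    # [^...] negates the bracket expression, so this matches any single
    # character that is NOT alphanumeric, whitespace, '&' or '_'.
    printf '%s\n' "${s//[^[:alnum:][:space:]&_]/}"
}

# The same set written as a sed regular expression (also hypothetical name).
strip_punct_sed() {
    printf '%s\n' "$1" | sed 's/[^[:alnum:][:space:]&_]//g'
}

strip_punct "Jiro. Inagaki' & Soul, Media_Breeze."      # prints: Jiro Inagaki & Soul Media_Breeze
strip_punct_sed "Jiro. Inagaki' & Soul, Media_Breeze."  # prints: Jiro Inagaki & Soul Media_Breeze
```

The pure-bash version avoids spawning a subprocess, which may matter inside a loop; the sed version works the same way on a stream or a file. To keep further characters, add them inside the brackets (a literal - is safest placed last).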