bash sed colors ansi-escape tput

Strip leading AND trailing ANSI/tput codes from a string


The application here is "sanitizing" strings for inclusion in a log file. For the sake of argument, let's assume that 1) colorizing the string at runtime is proper; and 2) I need the leading and trailing spaces on screen, but want the excess whitespace removed from the log.

Specifically, the output is tee-ed into a log file. Not all lines are colorized, and not all lines have leading or trailing spaces.

Given this, I want to

  1. Remove all codes, both those setting the color and those resetting it. The reason for this will become apparent in a moment.
  2. Remove leading and trailing whitespace

When you search for how to strip color codes in bash, you find many different ways to accomplish it. What I have discovered so far, however, is that nobody seems to address the trailing reset, the $(tput sgr0). In the examples I have seen this is inconsequential, but my additional requirement to strip leading and trailing spaces makes removing it necessary: spaces followed by a leftover reset code are no longer at the end of the string.

Here is my example script which demonstrates the issue:

#!/bin/bash

# Create a string with color, leading spaces, trailing spaces, and a reset
REPLY="$(tput setaf 2)       This is green        $(tput sgr0)"
echo "Colored output:  $REPLY"
# Remove initial color code
REPLY="$(echo "$REPLY" | sed 's,\x1B\[[0-9;]*[a-zA-Z],,g')"
echo "De-colorized output:  $REPLY"
# Remove leading and trailing spaces if present
REPLY="$(printf "%s" "${REPLY#"${REPLY%%[![:space:]]*}"}" | sed -n -e 'l')"
echo "Leading spaces removed:  $REPLY"
REPLY="$(printf "%s" "${REPLY%"${REPLY##*[![:space:]]}"}" | sed -n -e 'l')"
echo "Trailing spaces removed:  $REPLY"

The output is shown below (I can't figure out how to color text here; assume the first line is green and the subsequent lines are not):

[screenshot of the script's output]

I am willing to see the error of my ways, but after about three hours trying different things, I'm pretty sure my google-fu is failing me.

Thanks for any assistance.


Solution

  • I am willing to see the error of my ways, …

    The primary error is that the sed command removes only the Esc[… control sequences, but not the Esc(B sequence, which is also part of sgr0. It works if you change the command to:

    … | sed 's,\x1B[[(][0-9;]*[a-zA-Z],,g'
    

    The secondary error is that the sed -n -e 'l' command appends a literal $ sign to the end of the line, so the former trailing spaces are no longer trailing and are therefore not removed. Without that sed stage, the parameter expansions trim the string as intended.
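
Putting both fixes together, here is a minimal sketch rather than a definitive implementation. The function name sanitize is hypothetical, and the escape sequences are hard-coded as an assumption about what tput setaf 2 and tput sgr0 emit on an xterm-like terminal, so that the example does not depend on $TERM. The ESC byte is passed to sed literally instead of as \x1B, which should also work with sed implementations that do not understand \x1B.

```shell
#!/bin/bash

# ESC character, produced portably with printf
esc=$(printf '\033')

# sanitize (hypothetical helper): strip color/reset codes, then trim whitespace
sanitize() {
    local s=$1
    # Remove ESC[...<letter> sequences and the ESC(B sequence from sgr0
    s=$(printf '%s' "$s" | sed "s,${esc}[[(][0-9;]*[a-zA-Z],,g")
    # Trim leading whitespace (no sed 'l' stage, so no literal $ is appended)
    s=${s#"${s%%[![:space:]]*}"}
    # Trim trailing whitespace
    s=${s%"${s##*[![:space:]]}"}
    printf '%s\n' "$s"
}

# Assumed equivalents of "$(tput setaf 2)" and "$(tput sgr0)" on xterm
green="${esc}[32m"
reset="${esc}(B${esc}[m"

colored="${green}       This is green        ${reset}"
clean=$(sanitize "$colored")
printf '[%s]\n' "$clean"    # prints: [This is green]
```

When tee-ing, the same function could be applied line by line to the copy that goes into the log, while the unmodified string is still echoed to the screen.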