linux grep

How to exclude lines with duplicate strings using grep


I have a text file with the following contents.

good,bad,ugly
good,good,ugly
good,good,good,bad,ugly
good,bad,bad
bad,bad,bad,bad,good
bad,ugly,good
bad,good,bad
good,good,good,good,bad
ugly,bad,good
bad,bad,bad,good,ugly

I only want to list lines that have a single occurrence of ugly and bad. Any line with multiple bads needs to be excluded.

good,bad,ugly
good,good,good,bad,ugly
bad,ugly,good
ugly,bad,good

I've tried to use the following, but it still lists lines with multiple bads.

grep -E "bad|ugly" file.txt | grep -v "\('bad'\).*\1"

Solution

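  • Your current approach has two problems. First, grep -E "bad|ugly" matches any line containing either "bad" or "ugly", not both. Second, the pattern passed to grep -v puts literal single quotes inside the group, so it searches for the text 'bad' with the quotes included. That never appears in the file, so nothing is excluded. Instead, require both words in either order, then drop every line in which "bad" appears more than once: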

    grep -E 'bad.*ugly|ugly.*bad' file.txt | grep -v 'bad.*bad'
    

    This will give you:

    good,bad,ugly
    good,good,good,bad,ugly
    bad,ugly,good
    ugly,bad,good
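
If lines with a repeated "ugly" should be excluded as well, the same idea can be extended with one more grep -v. The following is a minimal sketch rather than the only way to do it: the function name single_bad_and_ugly is made up for illustration, and it assumes "bad" and "ugly" never occur as part of a longer word (such as "badge").

```shell
# Hypothetical helper: keep lines that contain "bad" and "ugly" exactly once each.
# Reads the files given as arguments, or standard input if there are none.
single_bad_and_ugly() {
    grep 'bad' "$@" |
        grep 'ugly' |
        grep -v 'bad.*bad' |
        grep -v 'ugly.*ugly'
}

# Example run against the sample data from the question.
single_bad_and_ugly <<'EOF'
good,bad,ugly
good,good,ugly
good,good,good,bad,ugly
good,bad,bad
bad,bad,bad,bad,good
bad,ugly,good
bad,good,bad
good,good,good,good,bad
ugly,bad,good
bad,bad,bad,good,ugly
EOF
```

For the sample data this prints the four lines listed as the desired output in the question. The first two greps require both words to be present, and each grep -v removes lines where one of the words appears a second time.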