Tags: regex, grep, match, exact-match

Extract only whole word using grep


I've got a big text file. I need to extract all the lines that contain the exact word "DUSP1". Here is an example of the lines:

9606    ENSP00000239223 DUSP1   BLAST
9606    ENSP00000239223 DUSP1-001 Ensembl

I want to retrieve the first line but not the second one.

I tried several commands, such as:

grep -E "^DUSP1"
grep '\<DUSP1\>'
grep '^DUSP1$'
grep -w DUSP1

But none of them seem to work. Which option should I use?


Solution

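  • The problem you are facing is that grep treats a dash (-) as a word delimiter: only letters, digits, and underscores count as word characters, so -w and \<...\> still match the DUSP1 in DUSP1-001.

    You should try this command: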

    grep '\sDUSP1\s' file
    

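    to ensure that there is whitespace on both sides of your word (\s is a GNU grep extension).
    Word boundaries (\b) will not help here: like -w and \<...\>, they treat the dash as a boundary, so '\bDUSP1\b' still matches DUSP1-001. If the word can also appear at the start or end of a line, accept a line edge as well as whitespace:

    grep -E '(^|\s)DUSP1(\s|$)' file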
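
To see the difference on the two sample lines from the question, here is a minimal sketch. It assumes the columns are separated by tabs and that the name always sits in the third column; the temporary file and the variable names are only for illustration. It uses the portable [[:space:]] class instead of the GNU-only \s, and shows an awk field comparison as one possible alternative.

```shell
# Recreate the two sample lines (tab-separated by assumption).
sample=$(mktemp)
printf '9606\tENSP00000239223\tDUSP1\tBLAST\n' > "$sample"
printf '9606\tENSP00000239223\tDUSP1-001\tEnsembl\n' >> "$sample"

# -w treats "-" as a non-word character, so both sample lines match.
count_w=$(grep -c -w 'DUSP1' "$sample")

# Require whitespace or a line edge on each side of the word:
# only the line whose field is exactly DUSP1 matches.
exact=$(grep -E '(^|[[:space:]])DUSP1([[:space:]]|$)' "$sample")

# Alternative: compare the third whitespace-separated field exactly
# (assumes the name is always in column 3).
exact_awk=$(awk '$3 == "DUSP1"' "$sample")

echo "lines matched by -w: $count_w"
echo "line matched by the whitespace-delimited pattern: $exact"
rm -f "$sample"
```

The awk form may be the more robust choice when the file has a fixed column layout, since it cannot be fooled by DUSP1 appearing as a whole word in some other column.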