unique text-processing comm

Find unique lines between two files


I have two very large files (File 1 and File 2). File 1 has many rows and columns; for simplicity, I am pasting only column 1. I want to print only those lines that are unique to File 1.

File 1:

AT1G01010.1
AT1G01020_P1
AT1G01020_P2
AT1G01040.2
AT1G01040_P1
AT1G01046.1
AT1G01050_ID7

File 2:

AT1G01010
AT1G01046
AT1G01050

Output:

AT1G01020_P1
AT1G01020_P2
AT1G01040.2
AT1G01040_P1

I have tried the comm command on Ubuntu, but it didn't work because it compares complete lines: when it compares AT1G01010.1 with AT1G01010, it doesn't report anything in common.


Solution

  • Try:

    grep -Fvf file2 file1
    

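    This will print the lines of file1 that do not match, in whole or in part, any line in file2. Here -f reads the patterns from file2, -F treats each pattern as a fixed string rather than a regular expression, and -v inverts the match so that only non-matching lines are printed.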
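As a quick check, the command can be run against the sample data from the question. The sketch below is only illustrative: the temporary directory and the variable names (tmpdir, result, strict) are made up for the demonstration. Because grep -F matches a pattern anywhere on the line, including in other columns of a multi-column file, the second command shows a stricter awk variant that compares only column 1. It assumes that the ID in File 1 ends at the first "." or "_", which holds for the sample but may not hold for the real data.

```shell
# Recreate the sample data from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' AT1G01010.1 AT1G01020_P1 AT1G01020_P2 AT1G01040.2 \
    AT1G01040_P1 AT1G01046.1 AT1G01050_ID7 > "$tmpdir/file1"
printf '%s\n' AT1G01010 AT1G01046 AT1G01050 > "$tmpdir/file2"

# Substring approach from the answer: drop every line of file1 that
# contains any line of file2 as a fixed string.
result=$(grep -Fvf "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$result"

# Stricter variant (assumption: the ID is column 1 up to the first "." or "_").
# First pass stores the IDs of file2; second pass prints the lines of file1
# whose trimmed column-1 ID was not stored.
strict=$(awk 'NR == FNR { seen[$1]; next }
              { id = $1; sub(/[._].*/, "", id); if (!(id in seen)) print }' \
              "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$strict"

rm -r "$tmpdir"
```

On the sample data both commands print the four lines listed under Output in the question. They can differ on real data, for example when an ID from File 2 appears inside a longer ID or in another column of File 1.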