I have 15 different files, and I want to create a new file that contains only the lines common to all of them. For example:
file1:
id1
id2
id3
file2:
id2
id3
id4
file3:
id10
id2
id3
file4:
id100
id45
id3
id2
I need the output to look like this:
newfile:
id2
id3
I know this command works for a pair of files:
grep -w -f file1 file2 > output
but I need a command that works for more than two files.
Any suggestions, please?
The same trick can be used more than once:
$ grep -w -f file1 file2 | grep -w -f file3 | grep -w -f file4
id2
id3
By the way, if you are looking for exact string matches rather than regular-expression matches, it is better and faster to use the -F flag:
$ grep -wFf file1 file2 | grep -wFf file3 | grep -wFf file4
id2
id3
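For 15 files, typing out the whole pipeline gets tedious. The same filtering can be sketched as a loop; common_lines is a hypothetical helper name, and the temporary files are just one way to carry the running result from one grep to the next:

```shell
# Sketch only: common_lines is a hypothetical helper, not part of grep.
# It repeats the "grep -wFf" filter above once per additional file.
common_lines() {
    result=$(mktemp) || return 1
    next=$(mktemp) || return 1
    cp "$1" "$result"          # start with every line of the first file
    shift
    for f in "$@"; do
        # keep only the lines of the running result that also match a line of $f
        grep -wFf "$f" "$result" > "$next"
        mv "$next" "$result"
    done
    cat "$result"
    rm -f "$result" "$next"
}
```

Assuming the inputs are named file1 through file15, it could be called as `common_lines file1 file2 file3 ... file15 > newfile`, listing every file name in full. As with the pipeline, -w matches whole words within a line; if the lines must be identical from start to end, grep's -x flag may be the better fit.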
$ awk 'FNR==1{nfiles++; delete fseen} !($0 in fseen){fseen[$0]++; seen[$0]++} END{for (key in seen) if (seen[key]==nfiles) print key}' file1 file2 file3 file4
id3
id2
FNR==1{nfiles++; delete fseen}
Every time we start reading a new file, we do two things: (1) increment the file counter, nfiles, and (2) delete the array fseen.
!($0 in fseen){fseen[$0]++; seen[$0]++}
If the current line is not a key in fseen, then add it to fseen and increment the count for this line in seen. Because fseen is reset for each file, a line repeated within one file is counted only once.
END{for (key in seen) if (seen[key]==nfiles) print key}
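After we have read the last line of the last file, we look at every key in seen. If the count for that key is equal to the number of files that we have read, nfiles, then we print that key. The order in which for (key in seen) visits the keys is unspecified, which is why the output above lists id3 before id2.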
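If the output order matters, one possible variant is to record each line the first time it appears and walk that list in the END block. This is a sketch only: common_lines_awk is a hypothetical wrapper name, and the order array and counter n are additions to the awk program explained above.

```shell
# Sketch only: common_lines_awk is a hypothetical wrapper name.
# Same counting logic as above, plus an "order" array so that common
# lines are printed in the order they were first seen.
common_lines_awk() {
    awk '
        FNR == 1 { nfiles++; delete fseen }        # new file: bump counter, reset per-file set
        !($0 in fseen) {
            fseen[$0] = 1                           # mark line as seen in this file
            if (!($0 in seen)) order[++n] = $0      # remember first appearance overall
            seen[$0]++                              # count the files containing this line
        }
        END {
            for (i = 1; i <= n; i++)
                if (seen[order[i]] == nfiles) print order[i]
        }
    ' "$@"
}
```

With the four sample files, `common_lines_awk file1 file2 file3 file4` prints id2 and then id3. Note that an empty input file never triggers FNR==1, so it is not counted in nfiles and is effectively ignored.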