This sounds simple but is a bit subtler than it looks. I would like to use a Unix utility to delete consecutive duplicate lines, keeping the first line of each run. However, I also want to preserve duplicates that do not come immediately after the original. For example, given the lines:
O B
O B
C D
T V
O B
I want the output to be:
O B
C D
T V
O B
Although the first and last lines are the same, they are not consecutive and therefore I want to keep them as unique entries.
You can use uniq, which removes only adjacent duplicate lines:
cat file1 | uniq > file2
or more succinctly:
uniq file1 file2
assuming file1 contains:
O B
O B
C D
T V
O B
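To check this against the sample from the question, here is a minimal sketch. It assumes you run it in an empty scratch directory, since it creates (and overwrites) file1 and file2 there:

```shell
# Recreate the sample input from the question in the current directory.
printf '%s\n' 'O B' 'O B' 'C D' 'T V' 'O B' > file1

# Collapse each run of adjacent identical lines into one line.
# The final "O B" is not adjacent to the first pair, so it is kept.
uniq file1 file2

cat file2
# Prints:
# O B
# C D
# T V
# O B
```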
For more details, see man uniq. In particular, note that the uniq command accepts up to two optional file arguments, with the following syntax: uniq [OPTION]... [INPUT [OUTPUT]].
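Because both file arguments are optional, the same filter can be invoked in three ways. The sketch below compares them on the sample input. The variable names are just for illustration, and it again assumes an empty scratch directory because it writes file1 and file2:

```shell
# Sample input from the question, written to file1 in the current directory.
printf '%s\n' 'O B' 'O B' 'C D' 'T V' 'O B' > file1

# No file arguments: read standard input, write standard output.
from_stdin=$(uniq < file1)

# INPUT only: read file1, write standard output.
from_input=$(uniq file1)

# INPUT and OUTPUT: read file1, write file2.
uniq file1 file2
from_output=$(cat file2)

# All three forms should produce the same lines.
if [ "$from_stdin" = "$from_input" ] && [ "$from_input" = "$from_output" ]; then
    echo "same result"
fi
```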
Finally, if you want to remove all duplicates, adjacent or not, you could do the following. Note that this also sorts the file, so the original line order is lost:
sort -u file1 > file2
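For comparison, here is what that does to the sample input. This is a small sketch, again assuming an empty scratch directory because it writes file1 and file2:

```shell
# Recreate the sample input from the question in the current directory.
printf '%s\n' 'O B' 'O B' 'C D' 'T V' 'O B' > file1

# Sort the lines and keep one copy of each distinct line.
sort -u file1 > file2

cat file2
# Prints:
# C D
# O B
# T V
```

Only one "O B" is left and the lines come out in sorted order, which is not what the question asks for, so plain uniq is the better fit there.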