I want to remove all duplicate lines from a file while ignoring the first 2 columns, i.e. those columns should not be part of the comparison.
This is my example input:
111 06:22 apples, bananas and pears
112 06:28 bananas
113 07:07 apples, bananas and pears
114 07:23 apples and bananas
115 08:01 bananas and pears
116 08:23 pears
117 09:22 apples, bananas and pears
118 12:23 apples and bananas
I want this output:
111 06:22 apples, bananas and pears
112 06:28 bananas
114 07:23 apples and bananas
115 08:01 bananas and pears
116 08:23 pears
I've tried the command below, but it compares only the third column and ignores the rest of the line:
awk '!seen[$3]++' sample.txt
Store $0 in a temporary variable, set $1 and $2 to empty, then use the newly composed $0 as the key:
awk '{ t = $0; $1 = $2 = "" } !seen[$0]++ { print t }' sample.txt
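Note that assigning to $1 and $2 makes awk rebuild $0 with single-space separators, so lines that differ only in the amount of whitespace between fields end up with the same key. If that is not wanted, a variant that strips the first two fields with sub() and leaves the rest of the line untouched might look like the sketch below. The function name dedupe_ignoring_first_two is made up for illustration, and the sketch assumes the columns are separated by spaces or tabs.

```shell
# Hypothetical helper (name is illustrative, not from the original answer).
# Prints each line whose content after the first two columns has not been
# seen before. Reads the files given as arguments, or stdin if none.
dedupe_ignoring_first_two() {
  awk '{
    key = $0
    # Remove the first two space/tab-delimited fields plus the blanks
    # around them; the remainder of the line is kept byte for byte.
    sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]*/, "", key)
    # Print the original line only the first time this key appears.
    if (!seen[key]++) print
  }' "$@"
}
```

It would be called as dedupe_ignoring_first_two sample.txt. For the sample input in the question it should give the same five lines as the command above; the two approaches only differ when the spacing inside the compared part of the line varies.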