Below is a tab-delimited file named file.txt, sorted on column one:
barbie 325 social activist
david 214 IT professional
david 457 mathematician
david 458 biologist
john 85 engineer
john 98 doctor
peter 100 statistician
I want to run the uniq command on the basis of column one, using options like -t and -k in the case of the sort command.
uniq -d (-t$'\t' -k1,1) file.txt # this is incorrect syntax in brackets, but I want to run it in similar way
This should be quite easy but I am unable to find my way.
What can I do to get output as:
david 214 IT professional
john 85 engineer
Debian uniq used to have this option, but it was removed for compatibility reasons. You can easily write your own AWK or Perl script instead. This prints only the first line for each distinct value of the first field:
awk -F '\t' '!x[$1]++' file.txt
x[$1] is an element of an associative array keyed on the contents of the first field ($1). It is incremented for each line with that value, but it is also used as the condition that determines whether the current line is printed; with the negation, it is true only if this field value has not been encountered before. (Reminder: the general form of an AWK script is zero or more condition { action } pairs, and both parts are optional; if { action } is missing, the default action is to print the current line. [If the condition is missing, the action is taken unconditionally.])
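The desired output in the question keeps only the keys that occur more than once, which is closer to what uniq -d does. A minimal sketch of that variant, assuming it is acceptable to read the file twice (so it will not work on piped input); the variable names sample and dupes and the array names count and seen are just illustrative:

```shell
# Recreate the sample data from the question in a temporary file
# (tab-delimited); with real data, use file.txt directly instead.
sample=$(mktemp)
printf '%s\t%s\t%s\n' \
  barbie 325 'social activist' \
  david 214 'IT professional' \
  david 457 mathematician \
  david 458 biologist \
  john 85 engineer \
  john 98 doctor \
  peter 100 statistician > "$sample"

# Pass 1 (NR == FNR): count how often each first field occurs.
# Pass 2: print a line only if its key occurs more than once
# and no line with that key has been printed yet.
dupes=$(awk -F '\t' '
  NR == FNR { count[$1]++; next }
  count[$1] > 1 && !seen[$1]++
' "$sample" "$sample")

printf '%s\n' "$dupes"
rm -f "$sample"
```

Because the counts are collected in a first pass, this does not depend on the input being sorted; the first line of each duplicated key is printed in file order.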