linux unix join

unix join command to return all columns in one file


I have two files that I am joining on one column. After the join, I just want the output to be all of the columns, in the original order, from only one of the files. For example:

cat file1.tsv 
1       a       ant
2       b       bat
3       c       cat
8       d       dog
9       e       eel

cat file2.tsv 
1       I
2       II
3       III
4       IV
5       V

join -1 1 -2 1 file1.tsv file2.tsv -t $'\t' -o 1.1,1.2,1.3
1       a       ant
2       b       bat
3       c       cat

I know I can use the -o 1.1,1.2,... notation, but my file has over two dozen columns. Is there a wildcard I can use, such as -o 1.*?


Solution

  • I'm not aware of wildcards in the format string.

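    Judging from your desired output, you may be able to get it without enumerating every column by filtering file1.tsv on the keys found in file2.tsv. Each pattern is anchored to the first tab-separated field so that a key such as 1 does not also match 10 or text elsewhere on the line:

    grep -f <(awk '{print "^" $1 "\t"}' file2.tsv) file1.tsv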
    1       a       ant
    2       b       bat
    3       c       cat
    

    Or as an awk-only solution:

    awk 'NR==FNR {a[$1]; next} $1 in a' file2.tsv file1.tsv
    1       a       ant
    2       b       bat
    3       c       cat
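
If you would rather stay with join itself, one possible workaround for the missing wildcard is to generate the -o list instead of typing it. The following is only a sketch under some assumptions: the files are tab-separated and already sorted on the join field, the first line of file1.tsv has the full number of columns, and the variable names (tmp, ncols, fmt, result) are made up for illustration. The sample files are recreated in a temporary directory so the snippet can run on its own.

```shell
# Recreate the question's sample files (tab-separated) in a scratch directory.
tmp=$(mktemp -d)
printf '1\ta\tant\n2\tb\tbat\n3\tc\tcat\n8\td\tdog\n9\te\teel\n' > "$tmp/file1.tsv"
printf '1\tI\n2\tII\n3\tIII\n4\tIV\n5\tV\n' > "$tmp/file2.tsv"

# Count the columns of file1.tsv from its first line (assumed representative).
ncols=$(awk -F'\t' 'NR==1 {print NF; exit}' "$tmp/file1.tsv")

# Build the format string "1.1,1.2,...,1.N" for join's -o option.
fmt=$(awk -v n="$ncols" 'BEGIN {
    for (i = 1; i <= n; i++) printf "%s1.%d", (i > 1 ? "," : ""), i
}')

# Join on the first field of both files, printing only file1's columns.
result=$(join -t "$(printf '\t')" -o "$fmt" "$tmp/file1.tsv" "$tmp/file2.tsv")
printf '%s\n' "$result"

rm -r "$tmp"
```

Unlike the grep and awk filters, this keeps join's own semantics: the inputs must be sorted on the key, and a key that appears more than once in file2.tsv produces a repeated row from file1.tsv.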