awk comm

Joining two text files on their first column and keeping all columns of the matching lines from the second file


I'm trying to join two text files on their first columns. Where the first columns match, I want to keep all the columns from the second file.

List1.txt
action
adan
adap
adapka
adat
yen


List2.txt
action  e KK SS @ n
adham   a d h a m
adidas  a d i d a s
administration  e d m i n i s t r e SS @ n
administrative  e d m i n i s t r e t i v
admiral e d m aj r @ l
adnan   a d n a n
ado     a d o
adan    a d @ n
adap    a d a p
adapka  a d a p k a
adrenalin       @ d r e n @ l i n
adrian  a d r j a n
adat    a d a t
adtec   e d t e k
adult   @ d a l t
yen     j e n

I'd like to get every entry in List1.txt that also appears in the first column of List2.txt, together with all the other columns from List2.txt. List3.txt should look like this:

List3.txt
action  e KK SS @ n
adan    a d @ n
adap    a d a p
adapka  a d a p k a
adat    a d a t
yen     j e n

I've tried the following command from here:

$ awk -F: 'FNR==NR{a[$1]=$0;next}{if($1 in a){print a[$1];} else {print;}}' List1.txt List2.txt > List3.txt

I've also tried this:

$ comm <(sort List2.txt) <(sort List1.txt)

Solution

  • I'm sure there are ways to do this in awk, but join is also relatively simple.

    join -1 1 -2 1 List1.txt <(sort -k 1,1 List2.txt) > List3.txt
    

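    The options -1 1 and -2 1 tell join to use the first column of List1 and the first column of List2 as the key (these are also the defaults). join expects both inputs to be sorted on that key, which is why List2 is passed through sort first; List1 as shown in the question is already in alphabetical order. The <( ... ) process substitution needs a shell such as bash that supports it.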

    This produces the columns you want, separated by a single space.

    List3.txt
    action e KK SS @ n
    adan a d @ n
    adap a d a p
    adapka a d a p k a
    adat a d a t
    yen j e n
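
For completeness, here is a minimal sketch of the awk route mentioned above. It is one possible approach, not necessarily what the linked command intended. The command in the question sets the field separator to ":" (which never occurs in these files, so $1 is the whole line) and its else branch prints the unmatched lines as well. The sketch below uses awk's default whitespace splitting and prints only the matching lines. The demo_list*.txt names are hypothetical, trimmed-down stand-ins for List1.txt and List2.txt so the example can be run without touching the real files.

```shell
# Hypothetical, trimmed-down copies of the question's two files.
printf '%s\n' action adan yen > demo_list1.txt
printf '%s\n' 'action  e KK SS @ n' 'adham   a d h a m' \
              'adan    a d @ n' 'yen     j e n' > demo_list2.txt

# While reading the first file (NR==FNR), remember each key and skip ahead.
# While reading the second file, print only lines whose first field is a
# remembered key. Lines are printed unchanged, so the original spacing stays.
awk 'NR==FNR { keys[$1]; next } $1 in keys' demo_list1.txt demo_list2.txt > demo_list3.txt

cat demo_list3.txt
```

Unlike join, this does not need either file to be sorted, and the output follows the line order of the second file. It assumes the first file is not empty, since NR==FNR would otherwise also be true while the second file is being read.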