Tags: bash, join, pipe, zsh

join with |, sort on three files


I've been trying to understand the command-line 'join' utility and came up with this:

file 1 (nobel_laureates.txt):

1901,Jean Henri Dunant,M
1901,Frederic Passy,M
1902,Elie Ducommun,M
1905,Baroness Bertha Sophie Felicita Von Suttner,F
1910,Permanent International Peace Bureau,

file 2 (nobel_nationalities.txt):

Jean Henri Dunant,Switzerland 
Frederic Passy,France 
Elie Ducommun,Switzerland 
Baroness Bertha Sophie Felicita Von Suttner

file 3 (capitals.txt):

Belgium,Brussels
France,Paris
Italy,Rome
Switzerland

I've tried

join -t, -1 2 -o 1.2 2.2 nobel_laureates.txt nobel_nationalities.txt | sort -k2 -t, | join -t, -e "<<NULL>>" -1 2 -o 1.1 2.2 - capitals.txt

and got

usage: join [-a fileno | -v fileno ] [-e string] [-1 field] [-2 field]
            [-o list] [-t char] file1 file2
usage: join [-a fileno | -v fileno ] [-e string] [-1 field] [-2 field]
            [-o list] [-t char] file1 file2

What's wrong with it?


Solution

  • Excerpt from the POSIX man page for join (the key sentences are the first and last of the final paragraph):

    -o list

    Construct the output line to comprise the fields specified in list, each element of which shall have one of the following two forms:

    1. file_number.field, where file_number is a file number and field is a decimal integer field number
    2. 0 (zero), representing the join field

    The elements of list shall be either <comma>-separated or <blank>-separated, as specified in Guideline 8 of XBD 12.2 Utility Syntax Guidelines. The fields specified by list shall be written for all selected output lines. Fields selected by list that do not appear in the input shall be treated as empty output fields. (See the -e option.) Only specifically requested fields shall be written. The application shall ensure that list is a single command line argument.


    So your join -o 1.2 2.2 ... command should be written as join -o 1.2,2.2 ... or join -o '1.2 2.2' ... to be POSIX-compliant.
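
    For illustration, here is a minimal sketch of the whole pipeline with the -o lists written as single comma-separated arguments. It rests on a few assumptions that are not in the original command: the sample files are recreated in a hypothetical scratch directory without trailing spaces, each input is pre-sorted on its join field (join expects sorted input), -2 1 is spelled out explicitly, and the intermediate file names laureates.by_name and nationalities.by_name are made up for the example.

```shell
#!/bin/sh
# Sketch only: the scratch directory and trimmed sample data are assumptions.
export LC_ALL=C                      # sort and join must agree on collation
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1

printf '%s\n' '1901,Jean Henri Dunant,M' '1901,Frederic Passy,M' \
              '1902,Elie Ducommun,M' > nobel_laureates.txt
printf '%s\n' 'Jean Henri Dunant,Switzerland' 'Frederic Passy,France' \
              'Elie Ducommun,Switzerland' > nobel_nationalities.txt
printf '%s\n' 'Belgium,Brussels' 'France,Paris' 'Italy,Rome' \
              'Switzerland' > capitals.txt

# join expects each input to be sorted on its own join field.
sort -t, -k2,2 nobel_laureates.txt     > laureates.by_name
sort -t, -k1,1 nobel_nationalities.txt > nationalities.by_name

# Each -o list is one argument (comma-separated), as POSIX requires.
result=$(
  join -t, -1 2 -2 1 -o 1.2,2.2 laureates.by_name nationalities.by_name |
    sort -t, -k2,2 |
    join -t, -e '<<NULL>>' -1 2 -2 1 -o 1.1,2.2 - capitals.txt
)
printf '%s\n' "$result"
```

    With this sample data, each output line should hold a laureate's name and the capital of their country, with <<NULL>> standing in where capitals.txt lists a country without a capital. Lines that have no match at all are still dropped; keeping them would need -a as well.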