shell sorting aix

Shell sort command with multiple priority but without "-k"


I'm on a very old AIX 6.1 system and I need to understand a "sort" command. Due to the age of the system, I can't find any documentation, and the '-h' option is not very helpful.

Usage:   sort   [-Abcdfimnru] [-T Directory] [-t Character] [-o File]
                [-y[Kilobytes]] [-z Recordsize] [-k Keydefinition]...
                [[+Position1][-Position2]]... [File]...

Here is a text file:

01_31_10:00_NNN_01_03_lkdjgldkfjgfdgjiod_001
08_07_13:21_NNN_05_03_dslkfhgkdlsgklfghd_001
01_31_10:01_NNN_06_03_fldkhdslkfkldfnsdl_001
01_31_10:01_NNN_05_03_dsflkdfiouzniudbcz_001
01_31_10:01_NNN_02_03_dslknfdjfhzeiuhbsd_001
01_31_10:21_NNN_01_03_dlksnfiunfzaiufskf_001
12_05_15:02_NNN_01_03_psqiodjksjbfisuvbz_001
01_31_10:21_NNN_08_03_sdlqkfhsnfiuqehfiu_001
08_07_13:50_NNN_01_03_dslknfaocnbsudfhgs_001
01_31_10:20_NNN_03_03_dlskfdksfusehbrksu_001
01_29_14:25_NNN_04_04_dsfihsdgnsdoiuvsdh_001
12_05_12:01_NNN_02_03_dlskhvosnvoihvslfi_001

Where: the 1st field is the month, the 2nd is the day, the 3rd is the time, the 4th is whatever, the 5th is a priority, the 6th is whatever, and the 7th is a random name.

Here is the command:

sort -t"_" +4 -5 +0 -1 +1 -2 +2 -3 +6 -8 file.list

And the result:

01_31_10:00_NNN_01_03_lkdjgldkfjgfdgjiod_001
01_31_10:21_NNN_01_03_dlksnfiunfzaiufskf_001
08_07_13:50_NNN_01_03_dslknfaocnbsudfhgs_001
12_05_15:02_NNN_01_03_psqiodjksjbfisuvbz_001
01_31_10:01_NNN_02_03_dslknfdjfhzeiuhbsd_001
12_05_12:01_NNN_02_03_dlskhvosnvoihvslfi_001
01_31_10:20_NNN_03_03_dlskfdksfusehbrksu_001
01_29_14:25_NNN_04_04_dsfihsdgnsdoiuvsdh_001
01_31_10:01_NNN_05_03_dsflkdfiouzniudbcz_001
08_07_13:21_NNN_05_03_dslkfhgkdlsgklfghd_001
01_31_10:01_NNN_06_03_fldkhdslkfkldfnsdl_001
01_31_10:21_NNN_08_03_sdlqkfhsnfiuqehfiu_001

I understand that the file is sorted by fields 5, 1, 2, 3 and 8, but how does it work? And what are the "+" arguments?


Solution

  • ChatGPT helped me: the command uses the old '+Position1 -Position2' syntax of the 'sort' command (the last line of the usage message above).

    1. -t"_":

      • Specifies the field separator as the underscore (_). Each field in the line is separated by an underscore.
    2. +4 -5:

      • +4 skips the first 4 fields, so the sort key starts at the 5th field (positions in this syntax are 0-indexed).
      • -5 ends the key after the 5th field, so only the 5th field is compared for this key and no further fields.
    3. +0 -1:

      • If the values in the 5th field are identical, it sorts based on the 1st field.
    4. +1 -2:

      • If the values in the 5th and 1st fields are identical, it sorts based on the 2nd field.
    5. +2 -3:

      • If the values in the 5th, 1st, and 2nd fields are identical, it sorts based on the 3rd field.
    6. +6 -8:

      • Finally, if all previous fields are equal, it sorts based on the 7th and 8th fields combined (the key starts at the 7th field with +6 and ends after the 8th field with -8).
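
The pair-by-pair reading above follows a simple rule for plain field numbers: +N -M corresponds to -k N+1,M. A minimal sketch of that rule as a shell function follows. Note that old_to_k is a hypothetical helper name, not part of sort, and it assumes positions without character offsets (nothing like +4.2).

```shell
#!/bin/sh
# Hypothetical helper (not part of sort): translate one obsolete
# "+POS1 -POS2" pair into the equivalent "-kSTART,END" option.
# Assumption: plain field numbers only, no character offsets.
old_to_k() {
    start=${1#+}    # strip the leading "+": number of fields skipped
    end=${2#-}      # strip the leading "-": zero-based field where the key stops
    # +N skips N fields, so the key starts at one-based field N+1.
    # -M stops before zero-based field M, i.e. after one-based field M.
    printf '%s%d,%d\n' '-k' "$((start + 1))" "$end"
}

old_to_k +4 -5    # prints -k5,5
old_to_k +0 -1    # prints -k1,1
old_to_k +6 -8    # prints -k7,8
```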

    What this command does

    The command sorts lines based on several fields separated by underscores (_), in this order of priority:

    1. 5th field (the column after the 4th underscore)
    2. 1st field
    3. 2nd field
    4. 3rd field
    5. 7th and 8th fields combined (compared together as one key)

    Equivalent modern syntax

    In modern versions of sort, this command would be written as:

    sort -t"_" -k5,5 -k1,1 -k2,2 -k3,3 -k7,8 file.list
    

    Here, each -k gives the start and end fields of one sort key (1-indexed, both inclusive). The keys are compared in the order given, and a later key is only consulted when all earlier keys are equal.
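
To check the -k form against the sample data from the question, something like the following could be run. This is only a sketch: sort_list is a hypothetical wrapper name, and LC_ALL=C is an assumption added here so that comparison is plain byte order regardless of the locale.

```shell
#!/bin/sh
# Sketch: sort lines read from standard input with the -k form of the command.
# sort_list is a hypothetical wrapper name; LC_ALL=C is an assumption that
# forces byte-order comparison independent of the user's locale.
sort_list() {
    LC_ALL=C sort -t_ -k5,5 -k1,1 -k2,2 -k3,3 -k7,8
}

# Feed the sample lines from the question through the wrapper.
sort_list <<'EOF'
01_31_10:00_NNN_01_03_lkdjgldkfjgfdgjiod_001
08_07_13:21_NNN_05_03_dslkfhgkdlsgklfghd_001
01_31_10:01_NNN_06_03_fldkhdslkfkldfnsdl_001
01_31_10:01_NNN_05_03_dsflkdfiouzniudbcz_001
01_31_10:01_NNN_02_03_dslknfdjfhzeiuhbsd_001
01_31_10:21_NNN_01_03_dlksnfiunfzaiufskf_001
12_05_15:02_NNN_01_03_psqiodjksjbfisuvbz_001
01_31_10:21_NNN_08_03_sdlqkfhsnfiuqehfiu_001
08_07_13:50_NNN_01_03_dslknfaocnbsudfhgs_001
01_31_10:20_NNN_03_03_dlskfdksfusehbrksu_001
01_29_14:25_NNN_04_04_dsfihsdgnsdoiuvsdh_001
12_05_12:01_NNN_02_03_dlskhvosnvoihvslfi_001
EOF
```

With this input, the output should list the lines in the same order as the result shown in the question.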

    Conclusion

    The command sorts lines using columns separated by underscores, first by the 5th field, then by the 1st, 2nd, 3rd, and finally by the 7th and 8th fields combined. It’s a way of sorting hierarchically based on multiple successive criteria.