I have 2 files with the format of
File A
01 20200111 28.56 22.07 40.14 49.79 22.81 49.31 33.75 31.24 39.41 36.18
02 20200118 32.41 14.89 38.82 60.54 11.54 49.10 34.34 25.53 36.96 34.30
03 20200125 21.95 18.48 28.45 42.94 22.45 42.40 36.43 34.53 43.51 33.28
...
01 20210109 29.32 24.60 34.41 46.56 29.38 44.06 34.47 33.75 41.12 36.04
02 20210116 29.38 28.39 33.07 42.63 29.46 39.41 32.45 33.60 43.81 34.93
03 20210123 27.51 19.55 32.98 45.88 26.05 46.88 37.58 31.98 44.52 35.90
File B
01 19912020 24.20 16.70 31.28 45.98 19.26 42.57 33.43 29.35 39.96 32.22
02 19912020 24.29 15.46 29.02 44.47 19.24 40.57 32.98 29.80 40.26 31.36
03 19912020 20.69 14.07 29.90 45.89 21.34 44.71 35.90 31.61 42.00 33.08
I would like to take the difference of Columns 3-12 of each id (Column 1) in File A from the same unique id in File B and add it at the end of each line in File A. File A has multiple instances of the same id. So each instance would be compared to the same id in File B.
My desired output is:
01 20200111 28.56 22.07 40.14 49.79 22.81 49.31 33.75 31.24 39.41 36.18 4.36 5.37 8.86 3.81 3.55 6.74 0.32 1.89 -0.55 3.96
02 20200118 32.41 14.89 38.82 60.54 11.54 49.10 34.34 25.53 36.96 34.30 8.12 -0.57 9.8 16.07 -7.7 8.53 1.36 -4.27 -3.3 2.94
03 20200125 21.95 18.48 28.45 42.94 22.45 42.40 36.43 34.53 43.51 33.28 1.26 4.41 -1.45 -2.95 1.11 -2.31 0.53 2.92 1.51 0.2
...
01 20210109 29.32 24.60 34.41 46.56 29.38 44.06 34.47 33.75 41.12 36.04 5.12 7.9 3.13 0.58 10.12 1.49 1.04 4.4 1.16 3.82
02 20210116 24.29 15.46 29.02 44.47 19.24 40.57 32.98 29.80 40.26 31.36 5.09 12.93 4.05 -1.84 10.22 -1.16 -0.53 3.8 3.55 3.57
03 20210123 20.69 14.07 29.90 45.89 21.34 44.71 35.90 31.61 42.00 33.08 6.82 5.48 3.08 -0.01 4.71 2.17 1.68 0.37 2.52 2.82
The above output (cols 3-22) needs to be in the following format (%8.2f)
XX 20230708 74.05 68.12 76.00 81.01 63.48 81.06 73.59 67.78 74.86 73.59 5.60 -1.14 1.22 2.50 -4.27 0.37 0.03 5.25 1.74 0.61
I attempted to use something like the below (I edited the code provided by markp-fuso(thanks!)) but the output appears to be alternating b/t File A and File B for each ID.
awk '
FNR==NR { for (i=3;i<=NF;i++) # 1st file: loop through 3rd-12th fields
b[$1][i]=$i # store field values in 2-dimensional array
next # skip to next input line
}
$1 in b { printf "%s %s%s", $1, OFS, $2 # 2nd file: if $1 is index in first dimension of array b[] then printf first 2 fields and then ...
for (i=3;i<=NF;i++) # loop through 3rd-12th fields
printf "%s%8.2f%8.2f", OFS, $i,$i-b[$1][i] # print the difference
print "" # terminate the current line of output
}
Any help is appreciated.
Assumptions/understandings based on comments from OP's previous question as well as measurements of the desired output:
0
of the 1st column%8.2f
(even though inputs and desired outputs appear to be formatted to %7.2f
; output diffs also vary between .2f
and .1f
)Modifying OP's current awk
attempt:
awk '
FNR==NR { for (i=3;i<=NF;i++)
b[$1][i]=$i
next
}
$1 in b { diffs = "" # initialize our string of diffs
printf "%2s %s", $1, $2 # hardcode the 2x spaces; no need to reference OFS
for (i=3;i<=NF;i++) { # loop through 3rd-12th columns
printf "%8.2f", $i # print current column
diffs = diffs sprintf("%8.2f", $i-b[$1][i]) # append new diff to "diffs" string
}
print diffs # print "diffs" string and terminate current line of output
}
' File_B File_A
This generates:
01 20200111 28.56 22.07 40.14 49.79 22.81 49.31 33.75 31.24 39.41 36.18 4.36 5.37 8.86 3.81 3.55 6.74 0.32 1.89 -0.55 3.96
02 20200118 32.41 14.89 38.82 60.54 11.54 49.10 34.34 25.53 36.96 34.30 8.12 -0.57 9.80 16.07 -7.70 8.53 1.36 -4.27 -3.30 2.94
03 20200125 21.95 18.48 28.45 42.94 22.45 42.40 36.43 34.53 43.51 33.28 1.26 4.41 -1.45 -2.95 1.11 -2.31 0.53 2.92 1.51 0.20
01 20210109 29.32 24.60 34.41 46.56 29.38 44.06 34.47 33.75 41.12 36.04 5.12 7.90 3.13 0.58 10.12 1.49 1.04 4.40 1.16 3.82
02 20210116 29.38 28.39 33.07 42.63 29.46 39.41 32.45 33.60 43.81 34.93 5.09 12.93 4.05 -1.84 10.22 -1.16 -0.53 3.80 3.55 3.57
03 20210123 27.51 19.55 32.98 45.88 26.05 46.88 37.58 31.98 44.52 35.90 6.82 5.48 3.08 -0.01 4.71 2.17 1.68 0.37 2.52 2.82