I want to loop through multiple subdirectories and parse a specific file (DOE.nas) that exists in each of them. The goal is to print the 4th column of every file into a single output file, side by side. Below is an example with just 2 subfolders.
(DOE.nas) in dir_1
PSHELL 217 136738 0.7 136738
PSHELL 1786 13571 1.4 13571
PSHELL 1605 136513 0.65 136513
PSHELL 1623 13571 1.4 13571
(DOE.nas) in dir_2
PSHELL 1628 136733 1.3 136733
PSHELL 2015 136514 0.75 136514
PSHELL 2304 136513 1.5 136513
PSHELL 2509 13571 1.2 13571
Desired output (out.txt)
0.7 1.3
1.4 0.75
0.65 1.5
1.4 1.2
Here is my code and the error:
for FILE in */*.nas; do
awk -v OFS='\t' '{print $4}' "$FILE" > tmp
paste tmp >> out.txt
done
My out.txt ends up with the columns stacked on top of each other instead of side by side:
0.7
1.4
0.65
1.4
1.3
0.75
1.5
1.2
Addressing OP's comment about having to process more than 2 files ...
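Assumptions: all DOE.nas files have the same number of lines (the END block below uses the last file's line count as the number of rows).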
Setup:
$ head dir_*/DOE.nas
==> dir_1/DOE.nas <==
PSHELL 217 136738 0.7 136738
PSHELL 1786 13571 1.4 13571
PSHELL 1605 136513 0.65 136513
PSHELL 1623 13571 1.4 13571
==> dir_2/DOE.nas <==
PSHELL 1628 136733 1.3 136733
PSHELL 2015 136514 0.75 136514
PSHELL 2304 136513 1.5 136513
PSHELL 2509 13571 1.2 13571
==> dir_3/DOE.nas <==
PSHELL 1628 136733 17.3 136733
PSHELL 2015 136514 20.75 136514
PSHELL 2304 136513 31.5 136513
PSHELL 2509 13571 16.2 13571
==> dir_4/DOE.nas <==
PSHELL 1628 136733 7.54 136733
PSHELL 2015 136514 12.3 136514
PSHELL 2304 136513 24.55 136513
PSHELL 2509 13571 -1.2 13571
One awk idea:
awk '
BEGIN { OFS="\t" }
      { lines[FNR] = lines[FNR] (FNR==NR ? "" : OFS) $4 }
END   { for (i=1; i<=FNR; i++)
            print lines[i]
      }
' dir_*/DOE.nas
NOTE: this replaces all of OP's current code (for / awk / paste)
This generates:
0.7 1.3 17.3 7.54
1.4 0.75 20.75 12.3
0.65 1.5 31.5 24.55
1.4 1.2 16.2 -1.2
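The END loop above takes its row count from FNR of the last file, so it implicitly relies on every file having the same number of lines. If that does not hold, one possible variation (a sketch only, not something OP asked for) is to store each value by row and file number and let the longest file set the row count. The sample files, the `demo` temp directory, and the `cell`, `nfiles` and `maxrow` names are all made up for illustration:

```shell
# Hypothetical sample data with unequal line counts (not from the original post).
demo=$(mktemp -d)
mkdir -p "$demo/dir_1" "$demo/dir_2" "$demo/dir_3"
printf 'PSHELL 217 136738 0.7 136738\nPSHELL 1786 13571 1.4 13571\n' > "$demo/dir_1/DOE.nas"
printf 'PSHELL 1628 136733 1.3 136733\nPSHELL 2015 136514 0.75 136514\nPSHELL 2304 136513 1.5 136513\n' > "$demo/dir_2/DOE.nas"
printf 'PSHELL 1628 136733 17.3 136733\n' > "$demo/dir_3/DOE.nas"

awk '
BEGIN    { OFS = "\t" }
FNR == 1 { nfiles++ }                        # first line of each non-empty file
         { cell[FNR, nfiles] = $4            # remember column 4 by (row, file)
           if (FNR > maxrow) maxrow = FNR }  # longest file sets the row count
END      { for (i = 1; i <= maxrow; i++) {
               line = ""
               for (j = 1; j <= nfiles; j++)
                   line = line (j > 1 ? OFS : "") cell[i, j]
               print line
           }
         }
' "$demo"/dir_*/DOE.nas > "$demo/out.txt"

cat "$demo/out.txt"
```

Missing values come out as empty tab-separated fields, so each file keeps its own column position. Empty input files are skipped entirely, because they never trigger the FNR == 1 rule.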