Using gawk, I want to process two files in a directory. The first file has a fixed name. The second file's name starts with a constant prefix but ends in a date and time stamp, which changes every time the file is created. I want to use the latest version of the second file.
I have seen an answer to a similar but less complicated question, "how to pass the most recent file from a directory to awk input file?", and the code
ls -lr 2nd_file_*| tail -n 1
does show me the latest file.
However, I do not know how to pass the found file name to gawk as the second file.
Currently I type the date/time stamp into the gawk command line by hand, e.g.
gawk -F[,"\t""}"] '{ do something }' file_1 2nd_file_2024_03_21_[18-21-32] > output_file
Does anyone know how I can do this? Thanks.
I haven't tried anything as I haven't a clue how to.
Setting aside the various issues with parsing ls output, one simple approach is to replace the 2nd file argument (to the awk script) with a command substitution that runs the ls|tail call, e.g.:
awk '{ do something }' file_1 $( ls -1r 2nd_file_* | tail -n 1 )
NOTE: OP has stated this particular ls|tail combo provides the desired file name, so I'm merely copying it here as an example.
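A slightly more defensive variant of the same idea, as a rough sketch: capture the name in a variable first, check that something was found, and quote it when handing it to awk. Quoting matters here because OP's real file names contain [ and ], which an unquoted expansion would treat as a glob pattern. The sandbox directory, the sample file names and the variable names second_file and output are assumptions made for illustration only.

```shell
# Hypothetical sandbox so the sketch does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'line_1\n' > file_1
printf 'old\n' > '2nd_file_2024_03_21_[18-21-32]'
printf 'new\n' > '2nd_file_2024_03_22_[09-15-00]'

# Pick the name that sorts last; this assumes zero-padded date/time stamps
# and no newlines in the file names.
second_file=$(ls -1 2nd_file_* | tail -n 1)

# Stop early if nothing matched, rather than running awk on one file only.
[ -n "$second_file" ] || { echo "no 2nd_file_* found" >&2; exit 1; }

# Quote the variable so the brackets in the name are passed through literally.
output=$(awk '{ print FILENAME ": " $0 }' file_1 "$second_file")
printf '%s\n' "$output"
```

The same quoting applies if the command substitution is used inline, i.e. "$( ls -1 2nd_file_* | tail -n 1 )" as the second file argument.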
To see this in action we'll start with some sample files:
$ head *
==> 2nd_file_2024_03_21 <==
21
==> 2nd_file_2024_03_22 <==
22
==> 2nd_file_2024_03_23 <==
23
==> 2nd_file_2024_03_24 <==
24
==> file_1 <==
line_1
To obtain the latest 2nd_file_* we need a tweak to OP's current ls|tail (dropping the -r so that the newest name sorts last):
$ ls -1 2nd_file_* | tail -n 1
2nd_file_2024_03_24
Wrapping this in a command substitution and feeding it to a simple awk script that prints each input line to stdout:
$ awk '{ print }' file_1 $( ls -1 2nd_file_* | tail -n 1 )
line_1 # line from file_1
24 # line from 2nd_file_2024_03_24
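For anyone who would rather not parse ls output at all, one possible alternative is to let the shell's own glob expansion do the sorting. This is only a sketch, and it assumes the date/time stamps are zero-padded so that name order matches chronological order. The sandbox directory and the variable names latest and result are made up for the example.

```shell
# Hypothetical sandbox that recreates the sample files used above.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'line_1\n' > file_1
for d in 21 22 23 24; do
    printf '%s\n' "$d" > "2nd_file_2024_03_$d"
done

# The shell expands a glob in sorted order, so after the loop
# "latest" holds the last (i.e. newest-named) match.
latest=
for f in 2nd_file_*; do
    [ -e "$f" ] && latest=$f   # skip the literal pattern if nothing matched
done

# Quote the variable so unusual characters in the name survive intact.
result=$(awk '{ print }' file_1 "$latest")
printf '%s\n' "$result"
```

If the newest file should instead be chosen by modification time rather than by name, ls -t combined with head -n 1 is the usual shortcut, with the same caveats about parsing ls output.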