I want to iterate over all folders and their subfolders and print the names of the .TXT files (in the subfolders) whose first line contains the string CYCLE DATE (there may be spaces and/or underscores between CYCLE and DATE). Here's my attempt at solving this:
In files_and_folders.sh I entered this:
#!/bin/bash
find . -name '*.TXT' -exec awk 'NR == 1 && $0 ~ /CYCLE[_ ]+DATE/ { print FILENAME }'
At the bash command line I entered this:
bash files_and_folders.sh
That produced the following error message:
find: missing argument to -exec
What is the correct way to do this?
I'd split this problem into three steps: find the .TXT files, take the first line of each, and check whether that line contains CYCLE DATE.
So,
#!/bin/bash
# Don't error on no file name matches:
shopt -s nullglob
# Enable recursive ** glob:
shopt -s globstar
for file in **/*.TXT ; do
    # head -n 1: read only the first line
    # grep -q:   quiet, only set the exit status; -E: extended regexes
    # echo:      print the file name when grep matched
    head -n 1 "${file}" | grep -q -E 'CYCLE[_ ]+DATE' && echo "${file}"
    # or your elegant:
    # awk 'NR == 1 && $0 ~ /CYCLE[_ ]+DATE/ { print FILENAME }' "${file}"
done
Of course, instead of grep you can use awk to analyze the line, but frankly, that's unnecessarily complex here. Your regular expression is very simple (CYCLE, then one or more spaces or underscores, then DATE), so a simple regex tool like grep can do the job.
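To see what that pattern accepts, here is a small sketch that runs it against a few made-up first lines. The sample strings and the check helper are hypothetical and exist only for illustration.

```shell
#!/bin/bash
# check: hypothetical helper that reports whether one line matches the pattern.
check() {
    if printf '%s\n' "$1" | grep -q -E 'CYCLE[_ ]+DATE'; then
        echo match
    else
        echo no-match
    fi
}

# Made-up sample lines, not taken from any real file.
r1=$(check 'CYCLE DATE 2020-01-01')   # one space between the words
r2=$(check 'CYCLE__ _DATE')           # mixed underscores and spaces
r3=$(check 'CYCLEDATE')               # no separator at all
r4=$(check 'CYCLE-DATE')              # separator is neither space nor underscore
printf '%s\n' "$r1" "$r2" "$r3" "$r4"
```

The first two lines match and the last two do not, because [_ ]+ requires at least one space or underscore and nothing else between the two words.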
The problem with your find command is that you use neither ';' nor '{}' after -exec, so find can't tell where the command it should execute ends (that is what ';' marks) or where to put the name of the file it found (that is what '{}' marks).
But since this doesn't even need find and can be done completely without it, I'd personally say for file in GLOB; do … done is easier to remember than find -name 'PATTERN' -exec Some complicated syntax '{}' ';'.