I have many files, each in its own directory. My script should:

1. Find a string in a file. Let's say the file is called "results" and the string is "average".
2. Append everything else on the string's line to another file called "allResults".

After running the script, "allResults" should contain as many lines as there are "results" files, like this:
allResults.txt (what I want):
Everything on the same line as the string, "average" in directory1/results
Everything on the same line as the string, "average" in directory2/results
Everything on the same line as the string, "average" in directory3/results
...
Everything on the same line as the string, "average" in directory-i/results
My script can find what I need: if I "cat" "allResults.txt" while the script is running and watch "ls -l" on its parent directory, I can see the output of the "find" on my screen, and the size of "allResults.txt" increases briefly, then goes back to 0. The problem is that "allResults.txt" is empty when the script has finished. So the results of the "find" are not being appended to "allResults.txt"; they're being overwritten. Here is my script (I use "gsed", GNU sed, because I'm a macOS Sierra user):
#!/bin/bash
# Loop over all directories, find.
let allsteps=100000
for ((step=0; step <= allsteps; step++)); do
i=$((step));
findme="average"
find ${i}/experiment-1/results.dat -type f -exec gsed -n -i "s/${findme}//p" {} \; >> allResults.txt
done
Please note that I have used ">>" in my example here because I read that it appends (which is what I want: a list of all lines matching my "find" from all files), whereas ">" overwrites. However, in both cases (whether I use ">" or ">>"), I end up with an empty allResults.txt file.
The reason allResults.txt ends up empty is the -i flag: it tells gsed to edit each file in place, so the modified lines are written back into results.dat instead of to standard output, and the redirection has nothing to append. (Worse, combined with -n, it rewrites each results.dat to contain only the printed lines.) Beyond that, grep's default behavior is to print matching lines; using sed for this is overkill.
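If you do want to keep the sed approach (your s command also strips the word "average" from each line, which matches your "everything else on the line" requirement), a minimal fix is to drop -i so the edited lines go to standard output, where the redirection can pick them up. The find wrapper is unnecessary for a single literal path:

gsed -n "s/${findme}//p" "${i}/experiment-1/results.dat" >> allResults.txt

But the whole thing can be much simpler.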
You also don't need an explicit loop. Excess looping is a habit programmers tend to import from languages where explicit iteration is the norm; most shell commands and constructs accept multiple file names.
grep average */experiment-1/results.dat > allResults.txt
What's nice about this is the output file is only opened once and is written to in one fell swoop.
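One caveat: when grep is given more than one file name, it prefixes each matching line with the file it came from, which is handy if you want to know which directory each result is from. If you want only the bare line contents, add -h (supported by both GNU and BSD grep) to suppress the prefix:

grep -h average */experiment-1/results.dat > allResults.txt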
If you indeed have hundreds of thousands of files to process, you might encounter a command-line length limit. If that happens, you can switch to a find call, which will make sure not to call grep with too many files at once.
find . -name results.dat -exec grep average {} + > allResults.txt
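The + at the end makes find pass as many file names to each grep invocation as will fit on the command line, much like xargs, so grep still runs only a few times rather than once per file. Since grep again sees multiple files per call, the same file-name prefix appears; add -h here too if you only want the line contents:

find . -name results.dat -exec grep -h average {} + > allResults.txt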