bash · file · sed · io · gnu-sed

Output file empty for Bash script that does "find" using GNU sed (gsed)


I have many files, each in its own directory. My script should collect every line containing the string "average" from each directory's results file and append those lines to a single file, allResults.txt.

allResults.txt (what I want):

Everything on the same line as the string "average" in directory1/results
Everything on the same line as the string "average" in directory2/results
Everything on the same line as the string "average" in directory3/results
...
Everything on the same line as the string "average" in directory-i/results

My script can find what I need. I have checked by running "cat" on "allResults.txt" while the script is working and "ls -l" on its parent directory. That is, I can see the output of the "find" on my screen, and the size of "allResults.txt" increases briefly, then goes back to 0. The problem is that "allResults.txt" is empty when the script has finished, so the results of the "find" are not being appended to "allResults.txt"; they are being overwritten. Here is my script (I use "gsed", GNU sed, because I'm on macOS Sierra):

#!/bin/bash

# Loop over all directories, find.
let allsteps=100000
for ((step=0; step <= allsteps; step++)); do
    i=$((step));

    findme="average"
    find ${i}/experiment-1/results.dat -type f -exec gsed -n -i "s/${findme}//p" {} \; >> allResults.txt
done 

Please note that I have used ">>" in my example here because I read that it appends (which is what I want: a list of all lines matching my "find" from all files), whereas ">" overwrites. However, in both cases (whether I use ">" or ">>"), I end up with an empty allResults.txt file.
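To illustrate the difference between the two operators (just a throwaway example, not part of my script):

# ">" truncates the target file on every redirection; ">>" appends to it.
echo "first"  > demo.txt    # demo.txt contains: first
echo "second" >> demo.txt   # demo.txt contains: first, second
echo "third"  > demo.txt    # demo.txt contains only: third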


Solution

  • grep's default behavior is to print matching lines, so using sed here is overkill. (The -i flag is also the likely reason allResults.txt ends up empty: with -i, sed writes its output back into the file it is editing rather than to stdout, so nothing reaches the redirection, and your results.dat files get rewritten in the process.)

    You also don't need an explicit loop. Excess looping is a habit programmers tend to import from languages where it is the norm; most shell commands and constructs accept multiple file names.

    grep average */experiment-1/results.dat > allResults.txt
    

    What's nice about this is that the output file is opened only once and written in one fell swoop.
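    If you ever do need a loop, the same principle still applies: redirect the loop as a whole rather than each command inside it. A rough sketch, assuming the same directory layout as in your question:

    # The single redirection opens allResults.txt once for the entire loop.
    for dir in */experiment-1; do
        grep average "$dir/results.dat"
    done > allResults.txt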

    If you really do have hundreds of thousands of files to process, you might hit the command-line length limit. If that happens, you can switch to a find call, which makes sure not to pass grep too many files at once.

    find . -name results.dat -exec grep average {} + > allResults.txt
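
    One detail to be aware of (standard grep behavior, not specific to this problem): when grep is given more than one file it prefixes each match with the file name. If you want only the matching lines themselves, add -h:

    find . -name results.dat -exec grep -h average {} + > allResults.txt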