bash sed command-line gnu-findutils

sed on many files - can we do better than invoke-once-per-file?


I have a set of > 100,000 files to which I want to apply a sed script. Reading the accepted answer to this question:

Applying sed with multiple files

I see the suggestion involves invoking sed for every single one of the files one is interested in, e.g.:

find "$root_path" -type f -name "whatever" -exec sed -i "somecommands" {} \;

but while this works, it's a bit silly: it starts a separate sed process for every file. After all, sed is willing to work on many files, e.g.:

sed -i "somecommands" input_file_1 input_file_2

So, I would tend to prefer:

sed -i "somecommands" $(find "$root_path" -type f -name "whatever")

and this does work.

... except when you have a lot of files. Then, bash tells you "argument list is too long".

Is there an alternative to these two approaches for applying the same sed script to tens, or hundreds, of thousands of files?


Solution

  • Do:

    find "$root_path" -type f -name "whatever" -print0 | xargs -0 sed -i "somecommands"
    

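    xargs collects the paths it reads into batches that fit within the system's argument-length limit and runs sed once per batch, so sed is started only a handful of times rather than once per file. The -print0 argument to find causes file paths to be printed with a trailing \0 character rather than a newline, and the corresponding -0 argument to xargs makes it use \0 as the separator on its input. This allows for filenames which contain spaces, quotes, or newlines, which xargs would otherwise split on or misinterpret.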
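As a rough illustration of the batching approach, here is a minimal sketch that builds a small scratch tree and edits every matching file. It assumes GNU sed (for `-i` without a backup suffix) and GNU findutils. The scratch directory, the file names, the `*.txt` pattern, and the `s/foo/bar/` script are all hypothetical stand-ins for the question's `$root_path`, "whatever", and "somecommands". The second `find` shows a variant that should behave the same way without xargs: ending `-exec` with `{} +` instead of `{} \;` asks find itself to pass many file names to each sed invocation.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch tree standing in for the real $root_path.
root_path=$(mktemp -d)
mkdir -p "$root_path/sub dir"
printf 'foo\n' > "$root_path/a.txt"
printf 'foo\n' > "$root_path/sub dir/b c.txt"   # name containing spaces

# xargs splits the NUL-separated list into as few sed invocations as
# fit within the system's argument-length limit.
find "$root_path" -type f -name '*.txt' -print0 | xargs -0 sed -i 's/foo/bar/'

# Variant without xargs: "{} +" makes find batch the file names itself.
find "$root_path" -type f -name '*.txt' -exec sed -i 's/bar/baz/' {} +

# Both files have now been edited twice (foo -> bar -> baz).
cat "$root_path/a.txt" "$root_path/sub dir/b c.txt"
```

The size of the limit that both variants stay under can be inspected with `getconf ARG_MAX`; its value differs between systems.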