awk sed string-substitution

Substitute multiple strings from a file over multiple files


Let's say I'm George Orwell and I want to replace all instances of "big" with "small", "rich" with "poor", "smart" with "stupid", etc., across a bunch of files. So I create a text file with one line per substitution:

file: substs.csv

big, small
rich, poor
smart, stupid

Now I want to apply the substitutions in substs.csv globally across a bunch of files. I assume this would use a sed script. Note that I'm happy to give substs.csv any format, as long as it's one substitution pair per line.

What's the right tool, and what's the script that will do this?

Edit 1: It's fine to operate on just one file at a time. I can do foreach or equivalent...

Edit 2: I can guarantee that substitutions on the right hand side don't appear on the left hand side, i.e., order of operation won't matter.

[I'm tempted to just bust out python and do it there. But this is a chance to refresh my unix tools chops...]


Solution

  • As Kamil said in the comments, there's probably a million different ways to stroke that cat ...

    One that sprang into my warped mind was:

    find . -type f -name "*.txt" -exec $(awk -F", *" 'BEGIN{printf "sed -i.bk "}{printf "-e s/%s/%s/g ", $1,$2}END{printf "\n"}' substs.csv) {} \;
    

    Basically, I'm building the sed command on the fly (from your substs.csv, using awk) and then running it via find to modify every file whose name ends in .txt. Your selection criteria may be wider, and you may not want backups of the files (get rid of the .bk in "sed -i.bk "), but it does what you're trying to achieve.
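A variant of the same idea, as a rough sketch rather than a definitive recipe: instead of splicing the awk output straight into the command line (which depends on the shell word-splitting it, so a pattern containing a space would break), write the substitutions to a sed script file and apply it with sed -f. The file names demo.txt and substs.sed and the sample sentence are made up for illustration, and the sketch assumes neither column contains "/", "&", or regex metacharacters.

```shell
#!/bin/sh
# Sketch: build a sed script from "old, new" pairs, then apply it with sed -f.
# Assumption: neither side of a pair contains "/", "&", or regex metacharacters.

# Work in a scratch directory so nothing real gets modified.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical demo data in the question's format.
printf 'big, small\nrich, poor\nsmart, stupid\n' > substs.csv
printf 'The big, rich and smart brother.\n' > demo.txt

# Each CSV line "old, new" becomes one sed command "s/old/new/g".
# Lines that do not split into exactly two fields are skipped.
awk -F', *' 'NF == 2 { printf "s/%s/%s/g\n", $1, $2 }' substs.csv > substs.sed

# Apply every substitution in one pass; -i.bk keeps the original as demo.txt.bk.
sed -i.bk -f substs.sed demo.txt

cat demo.txt
```

Run as is, this prints "The small, poor and stupid brother." and leaves the untouched original in demo.txt.bk. To use it across many files, the last step could be swapped for something like find . -type f -name "*.txt" -exec sed -i.bk -f substs.sed {} + with substs.sed given as a path that is valid from wherever find runs.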