regex perl sed mailto

How do I replace an arbitrary number of backreferences in sed or Perl? (for obfuscating mailto)


I'm looking for a way to obfuscate mailtos in the source code of a web site. I'd like to go from this:

href="mailto:president@whitehouse.gov"

To this:

href="" onmouseover="this.href='mai'+'lto:'+'pre'+'sid'+'ent'+'@wh'+'ite'+'hou'+'se.'+'gov'"

I'll probably go with a PHP solution instead, since that way I only have to globally replace the entire mailto and the source on my end will look cleaner. Still, I spent so much time looking at sed and Perl that now I can't stop thinking about how this could be done. Any ideas?

Update: Based heavily on eclark's solution, I eventually came up with this:

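#!/usr/bin/env perl -pi
# Only touch lines that contain a mailto link.
if (/href="mailto/i) {
    # $` holds the text before the match; +6 skips past href="
    my $start = (length $`) +6;
    # Length of the address, up to but not including the closing quote.
    my $len = index($_,'"',$start)-$start;
    # Swap the address for an onmouseover attribute built from 3-character chunks.
    substr($_,$start,$len,'" onmouseover="this.href=' .
    join('+',map qq{'$_'}, substr($_,$start,$len) =~ /(.{1,3})/g));
}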
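
The key step in the script above is the list-context match `/(.{1,3})/g`, which returns every chunk at once, so no fixed number of backreferences is needed. Here is a minimal standalone sketch of just that step; the helper name `chunk_and_join` is hypothetical and not part of the script.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into pieces of up to three
# characters, single-quote each piece, and join the pieces with '+'.
sub chunk_and_join {
    my ($text) = @_;
    my @chunks = $text =~ /(.{1,3})/g;   # list context: all matches at once
    return join '+', map { "'$_'" } @chunks;
}

print chunk_and_join('mailto:a@b.co'), "\n";
# prints 'mai'+'lto'+':a@'+'b.c'+'o'
```

Because the chunks are a fixed three characters wide, the split points differ slightly from the hand-written example in the question, where 'lto:' has four characters. The browser concatenates the pieces back into the same address either way.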

Solution

  • Building on Sinan's idea, here's a short Perl script that processes a file line by line.

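    #!/usr/bin/env perl -p
    
    my $pos = index($_,'href="mailto:');
    next if $pos < 0;      # no mailto link on this line; -p still prints it unchanged
    my $start = $pos +6;   # skip past href="
    my $len = index($_,'"',$start)-$start;
    # Replace only the address ($len characters) so the closing quote is kept.
    substr($_,$start,$len,'" onmouseover="this.href=' .
      join('+',map qq{'$_'}, substr($_,$start,$len) =~ /(.{1,3})/g)
    );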
    

    If you're going to use it, make sure your old files are committed to source control first, then add the -i option alongside -p (that is, -pi), which rewrites each file in place. Using -i on its own, without -p, would not loop over the input lines.
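
For comparison, the same rewrite can probably also be done with a single substitution using the `/e` modifier, which evaluates the replacement as Perl code. This is only a sketch under some assumptions: the href is double-quoted, the whole attribute is in the string being processed, and the address contains no single quotes. The name `obfuscate_mailto` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (an alternative sketch, not the script above): rewrite
# every double-quoted mailto href in a string with one substitution.
sub obfuscate_mailto {
    my ($html) = @_;
    $html =~ s{href="(mailto:[^"]*)"}{
        my $address = $1;    # copy before running the inner match
        'href="" onmouseover="this.href='
          . join('+', map { "'$_'" } $address =~ /(.{1,3})/g)
          . '"';
    }gie;
    return $html;
}

print obfuscate_mailto(qq{<a href="mailto:a\@b.co">mail</a>\n});
# prints <a href="" onmouseover="this.href='mai'+'lto'+':a@'+'b.c'+'o'">mail</a>
```

With the `/g` flag this handles several mailto links on one line, which the index-based version does not attempt.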