While working with a Twitter Search RSS feed in Yahoo Pipes, I'm trying to clean up long link text and replace it with the shortened URL. To that end, I want to match any link that does NOT point to a Twitter domain. Usually, those are t.co links.
Here's an example of what I want to do:
turn
<a href="http://t.co/AiyTQKaAoU">http://www.denverpost.com/environment/ci_26064841/colorado-coal-mine-mulls-appeal-after-federal-court ...</a>
into
<a href="http://t.co/AiyTQKaAoU">http://t.co/AiyTQKaAoU</a>
My regex started as <a .*?href=['""](.+?)['""].*?>(.+?)</a>
which matched all links.
Then I tried <a .*?href=['""]!(www\.twitter\.com\/?)['""].*?>(.+?)</a>
to remove twitter.com from the results, but it's not working. What am I doing wrong?
P.S. I need to not touch Twitter links because that will mess up all '@' and '#' links.
Addition: The solution by @Avinash-Raj works in the demo but not inside the Yahoo Pipe. Is anyone familiar with regex inside Yahoo Pipes?
In Yahoo Pipes, something like this should do:
href="(http://t.co[^"]*)"[^>]*>http://[^<]*
href="$1">$1
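To sanity-check that pair outside Pipes, here is a minimal Python sketch. It assumes Python's `re` module treats this pattern the same way the Pipes Regex operator does, which I haven't verified. `shorten_link_text` is a made-up helper name, the hashtag link is a hypothetical sample, and `\1` stands in for Pipes' `$1`.

```python
import re

# Pattern from above, verbatim. The dot in "t.co" is unescaped, so it
# matches any character there; write t\.co to be strict.
PATTERN = re.compile(r'href="(http://t.co[^"]*)"[^>]*>http://[^<]*')


def shorten_link_text(html: str) -> str:
    """Replace the visible text of t.co links with the t.co URL itself."""
    return PATTERN.sub(r'href="\1">\1', html)


# The example link from the question.
long_link = ('<a href="http://t.co/AiyTQKaAoU">http://www.denverpost.com/'
             'environment/ci_26064841/colorado-coal-mine-mulls-appeal-'
             'after-federal-court ...</a>')

# Hypothetical Twitter hashtag link, for illustration only.
hashtag_link = '<a href="https://twitter.com/search?q=%23coal">#coal</a>'

print(shorten_link_text(long_link))
print(shorten_link_text(hashtag_link))  # left alone: href is not a t.co URL
```

Because the pattern requires the href to start with `http://t.co`, the '@' and '#' links that point at twitter.com never match and stay untouched.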
Here's a demo pipe, and here's another, based on your pipe.
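As for why the second attempt fails: `!` is not a negation operator in regex. It matches a literal exclamation mark, so that pattern only matches an href whose value is `!www.twitter.com`. Excluding a domain is usually done with a negative lookahead, `(?!...)`. A rough Python sketch follows. I'm not certain the Pipes regex engine accepts lookaheads, and the sample links and constant names below are made up for illustration.

```python
import re

# The attempt from the question (with plain quotes): "!" is a literal character.
ATTEMPT = re.compile(
    r'''<a .*?href=['"]!(www\.twitter\.com/?)['"].*?>(.+?)</a>'''
)

# Negative lookahead: the href value must not start with a twitter.com URL.
NON_TWITTER = re.compile(
    r'''<a [^>]*?href=['"](?!https?://(?:www\.)?twitter\.com)([^'"]+)['"][^>]*>(.+?)</a>'''
)

# Hypothetical sample links.
tco_link = '<a href="http://t.co/AiyTQKaAoU">http://www.denverpost.com/ ...</a>'
mention_link = '<a href="https://twitter.com/denverpost">@denverpost</a>'

print(ATTEMPT.search(tco_link))          # no match: there is no "!" in the href
print(NON_TWITTER.search(tco_link))      # matches; group(1) is the t.co URL
print(NON_TWITTER.search(mention_link))  # no match: the lookahead rejects twitter.com
```

For this feed the simpler route of matching only `http://t.co` hrefs gives the same result without needing a lookahead.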
PS: You can put multiple regex replacements in a single Regex operator. It's easier to read that way.