Consider the following list of URLs:
1 http://www.cnn.com/international/stories/423423532
2 http://www.traderscreener.com/blah
3 http://is.gd/fsdaGdfd3
4 http://bit.ly/54HFD
5 http://stackoverflow.com/question/ask
I would like to expand shortened URLs to their original form:
$headers = get_headers($URL, 1);
if (!empty($headers['Location'])) {
    $headers['Location'] = (array) $headers['Location'];
    $URL = array_pop($headers['Location']);
}
However, I need to match all URLs against an array of shortening services:
$array = array(
    'is.gd', 'bit.ly', 'wibi.us', 'tinyurl.com' // etc
);
In this case, only URLs 3 and 4 should match. I believe the easiest way of doing this would be to grab the *** in http://***/blah. Since I have little experience with regex, what regex would I need? Or is there a better way of approaching this?
By far the easiest way to do this is not to build a blacklist at all. Instead, request the URL and see whether it redirects: send a HEAD request and check the status code. If it is 3xx, the response is a redirect, so read the "Location" header and use its value as the new URL.
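A minimal sketch of that approach follows. It assumes PHP 7.1 or later (for the context argument of get_headers). The function names redirect_target and expand_url are made up for this example. It only follows a single hop, and it does not resolve a relative Location value against the original URL.

```php
<?php
// Hypothetical helper: takes the array returned by get_headers($url, true)
// and returns the redirect target, or null if the response is not a 3xx.
function redirect_target(array $headers): ?string
{
    // Element 0 holds the status line, e.g. "HTTP/1.1 301 Moved Permanently".
    if (!isset($headers[0]) || !preg_match('#^HTTP/\S+\s+3\d\d\b#', $headers[0])) {
        return null;
    }
    // Header names are case-insensitive, so normalize the keys before the lookup.
    $lower = array_change_key_case($headers, CASE_LOWER);
    if (empty($lower['location'])) {
        return null;
    }
    // A repeated header arrives as an array; take the last value.
    $location = (array) $lower['location'];
    return end($location);
}

// Hypothetical wrapper: returns the expanded URL, or the input unchanged
// when the request fails or the response is not a redirect.
function expand_url(string $url): string
{
    // Send HEAD instead of GET and do not follow redirects automatically,
    // so the status code of the first response stays visible.
    $context = stream_context_create([
        'http' => ['method' => 'HEAD', 'follow_location' => 0, 'timeout' => 5],
    ]);
    $headers = @get_headers($url, true, $context);
    if ($headers === false) {
        return $url;
    }
    return redirect_target($headers) ?? $url;
}
```

Calling expand_url('http://bit.ly/54HFD') would then return whatever the service reports in its Location header, while a URL that answers with 200 comes back unchanged. No list of shortening services is needed.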