
Replace src attribute values from relative URLs to absolute URLs


Here's a simple preg_replace() call:

$string = 'src="index.php';
$string = preg_replace("/(src=('|\")+[^(http:|https:)])/i", "src=\"http://example.com/", $string);
echo $string;

I expect the result to be src="http://example.com/index.php but it turns out to be src="http://example.com/ndex.php.

I must be missing something here.


Solution

  • That's a really messed up regex. What are you trying to achieve exactly? It looks like if the URL doesn't start with http or https you want to add the domain? If so, you're quite a bit off:

    $string = preg_replace('/src=(\'|")?(?!https?:)/i', 'src="http://domain.com/', $string);
    

    should be a lot closer to the mark.

    What does this regex do? It looks for:

      • src=
      • optionally followed by a single or double quote
      • not followed by http: or https:

    Note: (?!...) is called a negative lookahead and is one example of a zero-width assertion. "Zero-width" here means that it doesn't consume any of the input. In this case it means "not followed by ...".
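
To make "zero-width" concrete, here is a rough sketch rather than a definitive fix. The sample inputs and the example.com base URL are made up for illustration. The possessive `?+` in the last pattern is my own assumption and not part of the regex above: it is one way to stop the engine from giving the optional quote back and then matching the bare `src=` in front of a URL that is already absolute.

```php
<?php
// Hypothetical sample inputs: one relative URL, one already absolute.
$relative = 'src="index.php"';
$absolute = 'src="http://example.com/logo.png"';

// The lookahead only peeks ahead, so the match ends right after the quote.
preg_match('/src=(\'|")?(?!https?:)/i', $relative, $m1);
echo $m1[0], "\n"; // src="

// Caveat: the quote is optional, so on backtracking the engine drops it and
// the lookahead then sees the quote instead of "http:". The bare src= matches.
preg_match('/src=(\'|")?(?!https?:)/i', $absolute, $m2);
echo $m2[0], "\n"; // src=

// Assumption: a possessive ?+ keeps the quote once it has matched, so an
// absolute URL no longer matches at all and is left untouched.
$pattern = '/src=([\'"]?+)(?!https?:)/i';
$fixedRelative = preg_replace($pattern, 'src=${1}http://example.com/', $relative);
$fixedAbsolute = preg_replace($pattern, 'src=${1}http://example.com/', $absolute);
echo $fixedRelative, "\n"; // src="http://example.com/index.php"
echo $fixedAbsolute, "\n"; // src="http://example.com/logo.png"
```

Capturing the quote and re-emitting it with `${1}` keeps whatever quote style the input used instead of forcing a double quote.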

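    What does your regex do? It looks for:

      • src=
      • followed by one or more ' or " characters
      • followed by exactly one character that is not any of ( ) | : h t p s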

    Note:

    [^(http:|https:)]
    

    is equivalent to:

    [^():|https]
    

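    meaning any single character that is not one of those characters. In your input that character is the i of index.php: it becomes part of the match and gets replaced along with everything else, which is why you end up with ndex.php.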
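
If you want to check the character class claim yourself, a quick sketch along these lines should do it. The `|` is listed explicitly in the second class because inside square brackets it is a literal character, not alternation. The loop over printable ASCII is just an illustrative way to compare the two classes.

```php
<?php
// The class from the question and its spelled-out equivalent.
$classA = '/[^(http:|https:)]/i';
$classB = '/[^():|https]/i';

// Every printable ASCII character should be accepted or rejected identically.
$same = true;
for ($c = 32; $c < 127; $c++) {
    if (preg_match($classA, chr($c)) !== preg_match($classB, chr($c))) {
        $same = false;
    }
}
var_dump($same); // bool(true)

// The class consumes exactly one character after the quote: the "i".
preg_match('/src=(\'|")+[^(http:|https:)]/i', 'src="index.php', $m);
echo $m[0], "\n"; // src="i
```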