php regex preg-replace autolink

Preg_replace - regular expression - PHP "autolink"


I have a piece of code that turns single words or phrases from a given list into clickable internal links. It is supposed to do this only if the word or phrase is not already linked. It works very well except for one thing: it also matches the file name inside the src attribute of images.

For example,

<img src="img/xiaomi.jpg" />

is output as

<img src="img/<a href="site.com/tag/xiaomi">Xiaomi</a>.jpg" />

As you can see, the regex is probably too greedy and matches text it should not.

The code has been simplified, but it is used as follows:

$content     = 'All post content itself with all html tags a site can have. <p>Blabla</p> <img src="img/xiaomi.jpg" /> <p>Bliblibli</p> <p>Lorem ipsum xiaomi</p>';
$contentCopy = 'All post content itself with all html tags a site can have. <p>Blabla</p> <img src="img/xiaomi.jpg" /> <p>Bliblibli</p> <p>Lorem ipsum xiaomi</p>';

$list = $this->cache->get('wordsList');
$text = $contentCopy; // working copy that the loop rewrites word by word

foreach ($list as $word) {
    $var = $word->word;
    $text = preg_replace('/<a[\S\s]+?<\/a>(*SKIP)(*FAIL)|\b'.$var.'\b/i', '<a href="'.base_url('site/tag/'.url_title($var)).'" target="_blank" title="'.ucfirst($var).'">$0</a>', $text);
}
$content = str_replace($contentCopy,$text,$content);

Can you please help me improve this code?

Apparently the problem only occurs in image tags.

I use this snippet to automatically create internal links to stored pages and to help the site's SEO.


Solution

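  • You may replace <a[\S\s]+?<\/a> with (?:<a[\S\s]+?<\/a>|<img\b[^>]*>) so that whole <img> tags are skipped in the same way as existing links. Here is a variation that uses . with the s modifier instead of [\S\s], and ~ as the delimiter so that / needs no escaping: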

    '~(?:<a.*?</a>|<img\b[^>]*>)(*SKIP)(*FAIL)|\b'.$var.'\b~si'
    

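    Quick details:

      • (?:<a.*?</a>|<img\b[^>]*>) matches either an existing link, from <a up to the nearest </a>, or a whole <img ...> tag.
      • (*SKIP)(*FAIL) discards that match and resumes the search after it, so nothing inside those tags is ever replaced.
      • \b'.$var.'\b matches the word or phrase as a whole word in the remaining text. The i modifier makes the match case-insensitive, and s lets . match line breaks.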

    See the regex demo.
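
To show how the suggested pattern could fit into the loop from the question, here is a minimal sketch. The function name autolink() and the $baseUrl parameter are hypothetical and not part of the original code; they stand in for the framework helpers base_url() and url_title(). The sketch also assumes the words come from an untrusted list, so it adds preg_quote() and uses preg_replace_callback() to keep any $ or \ in a word from being read as a replacement reference.

```php
<?php
// Hypothetical helper: autolink() and $baseUrl are illustrative names,
// not taken from the original post.
function autolink(string $html, array $words, string $baseUrl): string
{
    foreach ($words as $word) {
        // Skip existing links and whole <img> tags, then match the word on word boundaries.
        // preg_quote() escapes regex metacharacters in the word and the ~ delimiter.
        $pattern = '~(?:<a.*?</a>|<img\b[^>]*>)(*SKIP)(*FAIL)|\b'
                 . preg_quote($word, '~') . '\b~si';

        // Assumed URL scheme: base URL followed by the lowercased, URL-encoded word.
        $href  = htmlspecialchars($baseUrl . rawurlencode(strtolower($word)), ENT_QUOTES);
        $title = htmlspecialchars(ucfirst($word), ENT_QUOTES);

        // A callback keeps the matched text ($m[0]) exactly as it appeared in the content.
        $result = preg_replace_callback(
            $pattern,
            function (array $m) use ($href, $title): string {
                return '<a href="' . $href . '" target="_blank" title="' . $title . '">'
                     . $m[0] . '</a>';
            },
            $html
        );

        // preg_replace_callback() returns null on a regex error; keep the old text then.
        $html = $result ?? $html;
    }

    return $html;
}

$content = '<p>Blabla</p> <img src="img/xiaomi.jpg" /> <p>Lorem ipsum xiaomi</p>';
echo autolink($content, ['xiaomi'], 'site.com/tag/'), "\n";
```

With this input the src attribute should stay untouched while the word in the last paragraph is wrapped in a link. Text inside other attributes, such as alt on tags that are not listed in the pattern, could still be matched, so for arbitrary HTML a DOM-based approach may be more robust.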