php regex wordpress tinymce preg-replace

preg_replace regular expression HTML


I am using shortcodes in WordPress. After each shortcode's output (its closing div) I get a <br> (or <br />) tag. I am trying to filter these out, but I don't know how. The generated HTML looks like this:

<div class="fullwidth"><!-- 1st shortcode-->
<div class="fullwidth-content">
  <!-- 2nd shortcode-->
  <div class="twocol-one"> content
  </div><br>
</div><br>
<!-- 3rd shortcode-->
<div class="twocol-second"> content
</div><br>
<div class="clearboth"></div>
</div><br>

It seems each BR comes from a newline in TinyMCE, and I don't want to put the shortcodes on one very long line.

I am trying to use preg_replace, but I cannot come up with the correct $pattern.

Can you help me?

My function:

function replace_br($content) {
    $rep = preg_replace("/<\/div>\s*<br\s*\/?>/i", "</div>", $content);
    return $rep;
}
add_filter('the_content', 'replace_br');

It is not working.

When I use $rep = preg_replace("/\s*<br\s*\/?>/i", "", $content); in the function, all BRs are removed. Fine, but I want to remove only the BRs that follow a closing DIV tag.

str_replace("</div><br>", "</div>", $content); is also not working.

What's wrong with my function?

No error is returned.


Solution

  • You are approaching this from the wrong end: ideally the tags would not be generated in the first place, rather than removed afterwards. Using regex on HTML is also fragile in general (though sometimes it's OK-ish).

    A variation of the regex you're using should suffice: Demo
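
    One possible reason the filter from the question appears to do nothing is ordering. WordPress core registers wpautop on the_content at priority 10 and do_shortcode at priority 11, so a filter added at the default priority 10 may run before the shortcodes have produced any </div>. A minimal sketch, assuming that is the cause here; the function name strip_br_after_div and the priority 12 are illustrative choices, not taken from the question:

```php
<?php
// Hypothetical helper: drop a <br>, <br/> or <br /> that directly follows
// a closing </div>, allowing whitespace in between.
function strip_br_after_div(string $content): string
{
    // Using ~ as the delimiter avoids escaping the slashes in the tags.
    // preg_replace() returns null on error, so fall back to the input.
    return preg_replace('~</div>\s*<br\s*/?>~i', '</div>', $content) ?? $content;
}

// In WordPress, register the filter to run after do_shortcode (priority 11).
// The guard only exists so the snippet can also be run outside WordPress.
if (function_exists('add_filter')) {
    add_filter('the_content', 'strip_br_after_div', 12);
}

echo strip_br_after_div("<div>a</div><br>\n<p>one<br />two</p>");
```

    The BR inside the paragraph is left alone because it is not preceded by a closing div.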

    You should really consider using DOMDocument or similar:

    $html = <<<HTML
    ...
    HTML;

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // getElementsByTagName() returns a live list, so collect the nodes
    // first and remove them in a second pass.
    $remove = [];
    foreach ($dom->getElementsByTagName('br') as $item) {
        $remove[] = $item;
    }

    foreach ($remove as $item) {
        $item->parentNode->removeChild($item);
    }

    $html = $dom->saveHTML();

    echo $html;
    

    This removes every br; you would need to adjust the code to fit your requirements, but it should point you in the right direction.
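
    One way such an adjustment could look is sketched below: it removes a br only when its nearest preceding sibling, ignoring whitespace-only text, is a div. The helper name remove_br_after_div is hypothetical, and the sketch assumes the input is an HTML fragment, so it returns only the children of the body element that loadHTML() adds.

```php
<?php
// Hypothetical helper: remove only those <br> elements whose nearest
// preceding sibling (ignoring whitespace-only text nodes) is a <div>.
function remove_br_after_div(string $html): string
{
    if (trim($html) === '') {
        return $html; // loadHTML() rejects empty input
    }

    $dom = new DOMDocument();
    // @ silences libxml warnings about imperfect markup.
    @$dom->loadHTML($html);

    $remove = [];
    foreach ($dom->getElementsByTagName('br') as $br) {
        $prev = $br->previousSibling;
        // Skip whitespace-only text between </div> and <br>.
        while ($prev instanceof DOMText && trim($prev->nodeValue) === '') {
            $prev = $prev->previousSibling;
        }
        if ($prev instanceof DOMElement && strtolower($prev->nodeName) === 'div') {
            $remove[] = $br; // collect first, the node list is live
        }
    }
    foreach ($remove as $br) {
        $br->parentNode->removeChild($br);
    }

    // Serialize only the children of <body>, not the added <html> wrapper.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo remove_br_after_div('<div class="a"><div class="b">x</div><br></div><br><p>one<br>two</p>');
```

    Note that loadHTML() assumes ISO-8859-1 unless the markup declares otherwise, so content with non-ASCII characters would need an encoding hint, and the serializer may insert line breaks between block elements.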