php wordpress wordpress-plugin-creation

How to target non-empty p elements in PHP?


I am developing a plugin for my WordPress site. I want to select all non-empty paragraph elements.

Here is my code:



function my_php_custom_function($content){

 // Create a new DOMDocument instance
 $dom = new DOMDocument();

 // Load the HTML content into the DOMDocument
 $dom->loadHTML(mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

 // Create a DOMXPath object to query the DOM
 $xpath = new DOMXPath($dom);

 // Find all non-empty p elements in the content
 $p_elements = $xpath->query('//p[string-length(normalize-space()) > 0]');

 // A the_content filter must return the (possibly modified) content
 return $content;
}


add_filter('the_content', 'my_php_custom_function');

The $p_elements variable also contains the paragraphs that I created just by pressing Enter. When I inspect the DOM, they show up as <p>&nbsp;</p>


Solution

  • You're likely using some sort of WYSIWYG editor for your content, which in some cases produces elements containing only &nbsp;

    To get the non-empty P elements while also ignoring P elements that contain only &nbsp;, your XPath could look like the following:

    //p[normalize-space() and not(normalize-space(.) = '&nbsp;')]
    

    Updated answer:

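    Apparently, DOMDocument decodes &nbsp; into the actual non-breaking space character, which bin2hex() shows as the UTF-8 bytes c2a0. An XPath string literal does not decode HTML entities, so '&nbsp;' in the query is compared as literal text and never matches. Using this knowledge, we can put the character into the query as a hexadecimal escape instead (\xC2\xA0).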

    This would render your query to look somewhat like the following:

    $p_elements = $xpath->query('//p[normalize-space() and not(normalize-space(.) = "'."\xC2\xA0".'")]');
    

    While not pretty (due to all the escaping), it works in my small tests.
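
One limitation of the query above is that it only excludes paragraphs whose text is exactly one non-breaking space: XPath's normalize-space() strips spaces, tabs and line breaks, but not U+00A0. The sketch below is one possible variant, not the only way to do it. It first confirms the c2a0 observation, then uses translate() to turn every non-breaking space into a regular space before normalizing. The function name non_empty_paragraph_texts and the sample markup are made up for illustration, and the XML encoding prefix passed to loadHTML() is a common workaround for UTF-8 input that is assumed here rather than taken from the answer.

```php
<?php
// Check the claim from the answer: DOMDocument stores &nbsp; as U+00A0,
// which is the two bytes c2 a0 in UTF-8.
$probe = new DOMDocument();
$probe->loadHTML('<p>&nbsp;</p>');
$probe_hex = bin2hex($probe->getElementsByTagName('p')->item(0)->textContent);
echo $probe_hex, PHP_EOL; // c2a0

// Hypothetical helper (not from the original answer): returns the text of
// every <p> that contains something other than whitespace or U+00A0.
function non_empty_paragraph_texts(string $html): array
{
    // Collect parser warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // Assumption: the XML encoding prefix makes loadHTML() read the input as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nbsp  = "\xC2\xA0"; // UTF-8 bytes of the non-breaking space

    // translate() maps each non-breaking space to a regular space, so
    // normalize-space() can strip it like any other whitespace. A non-empty
    // result string counts as true inside the predicate.
    $nodes = $xpath->query("//p[normalize-space(translate(., '{$nbsp}', ' '))]");

    $texts = [];
    foreach ($nodes as $node) {
        $texts[] = $node->textContent;
    }
    return $texts;
}

// Sample markup (made up): two real paragraphs and three "empty" ones.
$html   = '<p>Hello</p><p>&nbsp;</p><p>&nbsp; &nbsp;</p><p></p><p>World</p>';
$result = non_empty_paragraph_texts($html);
print_r($result);
```

With this sample input only the "Hello" and "World" paragraphs are returned. The paragraph containing two non-breaking spaces separated by a regular space is also dropped, which the exact-match comparison would have let through. Inside a the_content filter the same predicate can be passed to $xpath->query() directly.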