php html domdocument domxpath

PHP DOMDocument: running an XPath query within query results


Yesterday I discovered PHP's DOMDocument and realized it could help me a lot when extracting values from HTML. It is much faster than what I was doing before, which was PHP with regex.

I'm stuck and can't find a solution (probably because of how I'm thinking about the problem). I've searched for over 16 hours and still have nothing.

$res = '
<html>
<div class="product bla bla">
<div class="size">xxl</div>
<div class="color yy">red</div>
</div>

<div class="product bla">
<div class="size xxs">xxs</div>
<div class="line line2">new</div>
</div>

<div class="product asd">
<div class="color xx">blue</div>
</div>
</html>
';

$dom = new DomDocument();
@ $dom->loadHTML($res); // utf8_decode
$dom->preserveWhiteSpace = false; // ?
$xpath = new DomXPath($dom);
$nodes = $xpath ->query("//*[contains(@class, 'product')]");

echo "Found {$nodes->length} matching places" . PHP_EOL;

foreach($nodes  as $node) {
    $name = $xpath->query("//*[contains(@class, 'color')]", $node);
    echo "<br>";
    echo $name[0]->nodeValue . PHP_EOL;
}

Result:

Found 3 matching places
red
red
red 

Expected result would be:

Found 3 matching places
red
Null / empty or so...
blue

My question is: how do I search within each search result while preserving the index, e.g. [0] - red, [1] - null, [2] - blue?

Also, if you know a good place to learn about DOMDocument, let me know.

Thank you for your help or hints.


Solution

  • Change your final loop to:

    foreach ($nodes as $node) {
         // Do not start the expression with `//`: it searches from the root of the document,
         // not from the context node you passed as the second argument.
         $name = $xpath->query("*[contains(@class, 'color')]", $node);
         // Check whether the first matching node exists before reading its value.
         echo null === $name[0] ? 'NULL' : $name[0]->nodeValue;
         echo PHP_EOL;
    }
    

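    For reference, here is a self-contained sketch of the same idea that collects the values into an array, so each product keeps its position. It is one possible variant, not the only way to do it. The variable names ($html, $products, $colors) are made up for illustration. It uses `.//` instead of `*`, on the assumption that the color element might not always be a direct child of the product element, and it uses `item(0)`, which returns null when nothing matched.

```php
<?php
// Sample markup from the question.
$html = '
<html><body>
<div class="product bla bla">
  <div class="size">xxl</div>
  <div class="color yy">red</div>
</div>
<div class="product bla">
  <div class="size xxs">xxs</div>
  <div class="line line2">new</div>
</div>
<div class="product asd">
  <div class="color xx">blue</div>
</div>
</body></html>';

$dom = new DOMDocument();
// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$products = $xpath->query("//*[contains(@class, 'product')]");

$colors = [];
foreach ($products as $product) {
    // The leading "." anchors the search at $product; ".//" then reaches
    // descendants at any depth, not only direct children.
    $match = $xpath->query(".//*[contains(@class, 'color')]", $product)->item(0);
    // item(0) returns null for an empty result, so the slot is still filled
    // and the array index lines up with the product index.
    $colors[] = $match === null ? null : trim($match->nodeValue);
}

var_export($colors);
```

    With the markup above, $colors should hold 'red', null and 'blue' at indexes 0, 1 and 2.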