php html xpath domxpath

How to handle multiple sub-elements in PHP DOMXPath?


I'd like to import an HTML document into a MySQL database using PHP.

The structure of the document looks like this:

<p class="word">
<span class="word-text">word1</span>
<span class="grammatical-type">noun</span>
</p>
...
<p class="word">
<span class="word-text">word128</span>
<span class="grammatical-type">adjective</span>
</p>

For each word, I only have one word-text and one grammatical-type.

I'm able to find each word node, but I'd like to run a MySQL query using the values of its word-text and grammatical-type children:

$dom = new DOMDocument();
$dom->loadHTMLFile($location);
$xpath = new DOMXPath($dom);
$res = $xpath->query("//p[@class='word']");
foreach ($res as $textNode) {
    // do something here with the word-text nodeValue
    // do something here with the grammatical-type nodeValue
}

Inside the foreach loop, I tried passing $textNode, which is a DOMNode, as the $contextNode, as follows:

$wordText = $xpath->query("span[@class='word-text']", $textNode);
$myWord = $wordText->nodeValue;

But in $wordText I only have a DOMNodeList with a NULL nodeValue.

Starting from the word node, how can I access its child nodes?

Thanks


Solution

  • Solved.

    query() returns a DOMNodeList rather than a single node, which is why nodeValue is NULL on the result. Since each word node contains exactly one matching element, select that element with item(0):

    $dom = new DOMDocument();
    $dom->loadHTMLFile($location);
    $xpath = new DOMXPath($dom);
    $res = $xpath->query("//p[@class='word']");
    foreach ($res as $textNode) {
        // query() returns a DOMNodeList; item(0) is its first (and here only) match
        $wordTextNode = $xpath->query("span[@class='word-text']", $textNode);
        $word = $wordTextNode->item(0)->nodeValue;

        // do the same thing here for grammatical-type
    }
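
For completeness, here is a minimal self-contained sketch of the same approach that reads both children of each word node. The function name extractWords, the inline sample HTML, and the array keys 'text' and 'type' are assumptions made for illustration and are not part of the original code. It loads the markup from a string with loadHTML() instead of loadHTMLFile(), and skips any word node that is missing one of the two spans.

```php
<?php
// Hypothetical helper: returns one ['text' => ..., 'type' => ...] row
// per <p class="word"> element found in the given HTML string.
function extractWords(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $rows = [];
    foreach ($xpath->query("//p[@class='word']") as $wordNode) {
        // Relative expressions are evaluated against the context node $wordNode.
        $text = $xpath->query("span[@class='word-text']", $wordNode)->item(0);
        $type = $xpath->query("span[@class='grammatical-type']", $wordNode)->item(0);
        if ($text === null || $type === null) {
            continue; // item(0) is null when the list is empty
        }
        $rows[] = [
            'text' => trim($text->nodeValue),
            'type' => trim($type->nodeValue),
        ];
    }
    return $rows;
}

// Sample input mirroring the structure shown in the question.
$html = '<p class="word"><span class="word-text">word1</span>'
      . '<span class="grammatical-type">noun</span></p>'
      . '<p class="word"><span class="word-text">word128</span>'
      . '<span class="grammatical-type">adjective</span></p>';

foreach (extractWords($html) as $row) {
    echo $row['text'], ' => ', $row['type'], PHP_EOL;
}
```

Each returned row could then be bound to a prepared statement for the MySQL insert, rather than building the query string by hand.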