php symfony domcrawler

Extract untagged elements with Symfony DomCrawler


How can I extract untagged text with the Symfony DomCrawler? For example, in the sample HTML below I want to extract "Hello World".

<strong>title</strong>Hello World<strong>Sub-Title</strong><div>This is just stuff</div>

Solution

  • You can do this easily with PHP's built-in DOM extension ;)

    $dom = new DOMDocument();
    $dom->loadHTML('<strong>title</strong>Hello World<strong>Sub-Title</strong><div>This is just stuff</div>');
    $xpath = new DOMXPath($dom);
    // loadHTML() wraps the fragment in <html><body>, so the untagged text
    // becomes a direct text-node child of <body>; text() selects those nodes
    $entries = $xpath->query('//body/text()');
    foreach ($entries as $entry) {
        echo $entry->nodeValue;
    }
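
If you need the text as an array instead of echoing it, the same idea can be wrapped in a small helper. This is only a sketch: `directTextOf()` is a hypothetical name, not part of PHP DOM or DomCrawler. It walks the direct children of a node and keeps the non-empty text nodes.

```php
<?php
// Hypothetical helper (not part of PHP DOM or DomCrawler): returns the
// trimmed, non-empty text nodes that are direct children of $node.
function directTextOf(DOMNode $node): array
{
    $texts = [];
    foreach ($node->childNodes as $child) {
        // Keep only text nodes; element children such as <strong> are skipped.
        if ($child->nodeType === XML_TEXT_NODE) {
            $value = trim($child->nodeValue);
            if ($value !== '') {
                $texts[] = $value;
            }
        }
    }
    return $texts;
}

$html = '<strong>title</strong>Hello World<strong>Sub-Title</strong><div>This is just stuff</div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// loadHTML() adds the <html><body> wrapper, so <body> holds the fragment.
$body = $dom->getElementsByTagName('body')->item(0);

$texts = directTextOf($body);
echo implode("\n", $texts), "\n";
```

Since DomCrawler works on the same DOM nodes, the helper should also accept the node returned by `$crawler->filterXPath('//body')->getNode(0)`, assuming symfony/dom-crawler is installed and `$crawler` was built from the same HTML string.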