
Fetching h1 tag text with DOMDocument


I have a very large HTML page and need to extract the text inside its h1 tags. The examples I have seen mostly use DOMDocument for parsing XML. My data is HTML, and quite messy. What code would fetch the text inside the <h1></h1> tags?

There may be several <h1> tags. This is what I have so far:

    $doc = new DOMDocument();
    // @ suppresses the warnings libxml raises on malformed HTML
    @$doc->loadHTML($this->siteHtmlData);
    $aElements = $doc->getElementsByTagName("h1");

Please help me.

Thanks


Solution

  • getElementsByTagName() returns a list of every matching node, so you can loop over it to print the text of each <h1>:

    foreach ($aElements as $node) {
        echo $node->nodeValue, PHP_EOL;
    }
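
Putting the question's code and the loop together, a minimal self-contained sketch might look like the following. The helper name extractH1Texts and the sample markup are assumptions made for illustration, standing in for $this->siteHtmlData. It also swaps the @ operator for libxml_use_internal_errors(), which is one common way to keep malformed HTML from printing warnings.

```php
<?php
// Hypothetical sample markup standing in for $this->siteHtmlData.
$html = '<html><body>'
      . '<h1>First heading</h1><p>Some text</p>'
      . '<div><h1> Second <em>heading</em> </h1></div>'
      . '</body></html>';

/**
 * Return the trimmed text of every <h1> in an HTML string.
 * extractH1Texts is an illustrative name, not part of the DOM extension.
 */
function extractH1Texts(string $html): array
{
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('h1') as $node) {
        // textContent also includes the text of nested elements such as <em>.
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

foreach (extractH1Texts($html) as $text) {
    echo $text, PHP_EOL;
}
```

Returning an array rather than echoing inside the helper keeps the parsing separate from the output, so the same function can feed a database insert or further filtering.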