php html parsing dom simpledom

Simple Dom Parser - stripping links and specific divs from the result


I am trying to parse the content of a specific div and save it to an external file. That part works, but I couldn't manage to do the following:

From the div with class league_container:

  1. remove all divs with the class bar
  2. strip all links (keep the link text, but remove the a tag and its attributes)

What I have so far is:

    <?php
    include 'simple_html_dom.php';
    $html = file_get_html('https://some.domain.com/');

    $divContents = array();

    foreach ($html->find('div.league_container') as $div)
    {
        $divContents[] = $div->outertext;
    }

    file_put_contents('parser/est-results.htm', implode(PHP_EOL, $divContents));
    ?>

Any help would be appreciated.


Solution

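  • Assign to outertext on the nested elements before reading the container's outertext. An empty string removes the element; the element's own text replaces the tag with its content:

    // here $div is a div.bar found inside the container
    $div->outertext = '';
    // here $a is a link found inside the container
    $a->outertext = $a->text();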
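
Putting the two assignments into the loop from the question might look like the sketch below. It assumes simple_html_dom.php sits next to the script, as in the question. The function name cleanLeagueContainers and the sample markup are made up for illustration, and str_get_html is used instead of file_get_html only so the example runs without a network request.

```php
<?php
require_once 'simple_html_dom.php';

// Hypothetical helper: returns the outer HTML of every div.league_container
// with nested div.bar elements removed and links reduced to their text.
function cleanLeagueContainers($markup)
{
    $html = str_get_html($markup);
    if (!$html) {
        return ''; // parsing failed or input was empty
    }

    $divContents = array();

    foreach ($html->find('div.league_container') as $div) {
        // 1. Drop every div.bar inside the container.
        foreach ($div->find('div.bar') as $bar) {
            $bar->outertext = '';
        }

        // 2. Replace every link with its text content.
        foreach ($div->find('a') as $a) {
            $a->outertext = $a->text();
        }

        // The container is rendered from its children, so the
        // changes above show up in its outertext.
        $divContents[] = $div->outertext;
    }

    return implode(PHP_EOL, $divContents);
}

// Made-up sample input for illustration.
$sample = '<div class="league_container">'
        . '<div class="bar">toolbar</div>'
        . '<p><a href="/team/1" title="details">Team A</a> 2:1</p>'
        . '</div>';

echo cleanLeagueContainers($sample), PHP_EOL;
```

To write the result to a file as in the question, pass the return value to file_put_contents('parser/est-results.htm', ...). Note that text() returns plain text, so any markup nested inside a link (for example a b or span tag) is dropped as well; using $a->innertext instead should keep it.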