php html html-parsing text-extraction domparser

Get text from all <li> tags which also include <a> tags


I have a few <li> tags inside a <div> like this:

<li> <a href="link1"> one <li>
<li> <a href="link2"> two <li>
<li> <a href="link3"> three <li>

How can I get the link text (for example, two) using an HTML DOM parser and put it in an array to use later?


Solution

  • Make sure each <a> tag is closed (and each <li> too); then you can do it like this:

    <?php 
    $html = '<li> <a href="link1"> one </a> </li>
    <li> <a href="link2"> two </a> </li>
    <li> <a href="link3"> three </a> </li>
    ';
    
    // Create a new DOM Document
    $xml = new DOMDocument();
    
    // Load the html contents into the DOM
    $xml->loadHTML($html);
    
    // Empty array to hold the text of each link
    $result = array();
    
    //Loop through each <li> tag in the dom
    foreach($xml->getElementsByTagName('li') as $li) {
        //Loop through each <a> tag within the li, then extract the node value
        foreach($li->getElementsByTagName('a') as $links){
            $result[] = $links->nodeValue;
        }
    }
    // Print the collected link texts
    print_r($result);
    /*
    Array
    (
        [0] =>  one 
        [1] =>  two 
        [2] =>  three 
    )
    
    */
    ?>
    

    It's all in the PHP manual for DOMDocument.
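
If the surrounding whitespace in the results is unwanted, a variation along these lines may help. It is only a sketch: `extractLinkTexts` is a hypothetical helper name, not something from the answer above, and it assumes you want only <a> elements that are direct children of an <li>. It uses DOMXPath instead of nested loops and trims each value.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the trimmed
// text of every <a> element that is a direct child of an <li>.
function extractLinkTexts(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them,
    // then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $texts = [];
    foreach ($xpath->query('//li/a') as $a) {
        // textContent holds the text inside the link; trim() strips the padding
        $texts[] = trim($a->textContent);
    }
    return $texts;
}

$html = '<li><a href="link1"> one </a></li>
<li><a href="link2"> two </a></li>
<li><a href="link3"> three </a></li>';

$texts = extractLinkTexts($html);
print_r($texts);
echo $texts[1], PHP_EOL; // the text of the second link
```

Once the values are in the array, a single one can be picked by index, as in the last line, which should give the "two" asked about in the question.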