php html string html-parsing text-extraction

Get HTML markup after a specified tag


If I had the following text in a string and I didn't know what was inside the <h4> tag:

<h4>Tom</h4>
<p>One Paragraph</p>
<p>Two Paragraph</p>

What code would I need to parse that HTML string and get an output like this:

 <p>One Paragraph</p>
 <p>Two Paragraph</p>

Solution

  • Use stripos to find the offset where </h4> starts. Add the length of </h4> to that offset, then use substr to get all the text after it.

    Example:

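    $str = '....Your string...';
    // Case-insensitive search for the closing tag
    $offset = stripos($str, '</h4>');
    if ($offset === false) {
        // Error: the closing </h4> tag wasn't found
        $newStr = '';
    } else {
        // Skip past the closing tag and keep everything after it
        $offset += strlen('</h4>');
        $newStr = substr($str, $offset);
    }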
    

    I should point out that if the HTML gets any more complex, or you don't control the HTML, you may want to use an HTML parser. A parser is much more robust and less likely to fail if it encounters, for example, < /h4 > rather than </h4>. For this case, though, it is overkill.
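
    For reference, the parser route could look roughly like the sketch below, which uses PHP's bundled DOM extension. The function name markupAfterFirstH4 and the wrapping <div> are my own assumptions for illustration, not part of the question. It also assumes ASCII or entity-encoded input, since loadHTML does not assume UTF-8 by default.

```php
<?php
// Hypothetical helper: returns the markup of everything that follows
// the first <h4> element at the same nesting level.
function markupAfterFirstH4(string $html): string
{
    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap the fragment in a <div> so it has a single root element, and
    // ask libxml not to add <html>/<body> or a doctype around it.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $h4 = $doc->getElementsByTagName('h4')->item(0);
    if ($h4 === null) {
        // No <h4> found: nothing to return.
        return '';
    }

    // Serialize every sibling node that comes after the <h4>.
    $out = '';
    for ($node = $h4->nextSibling; $node !== null; $node = $node->nextSibling) {
        $out .= $doc->saveHTML($node);
    }

    return trim($out);
}

$str = "<h4>Tom</h4>\n<p>One Paragraph</p>\n<p>Two Paragraph</p>";
echo markupAfterFirstH4($str), "\n";
```

    Unlike the substr version, this only looks at siblings of the first <h4>, so if the heading is nested inside another element you would get the rest of that element rather than the rest of the string.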