<h2>t1</h2>
<strong>s1</strong>
</hr>
<h2>t2</h2>
<p><strong>s2</strong></p>
<strong>s3</strong>
<strong>s4</strong>
<h2>t3</h2>
<strong>s5</strong>
Each 'h2' tag is followed by an unknown number of 'strong' and 'p' tags, and sometimes a 'strong' is embedded in a 'p'. Other tags, such as 'hr', may also appear. Is there a way to retrieve every 'h2' together with the first 'strong' that follows it? For example, for the markup above, I would like to get:
t1
s1
t2
s2
t3
s5
I tried collecting all 'h2' elements into one array and all 'strong' elements into another, but then I could not tell which 'strong' is the first one following each 'h2'.
You can approach it in the following ways:
include_once 'simple_html_dom.php';
$nodes = str_get_html('<h2>t1</h2>
<strong>s1</strong>
</hr>
<h2>t2</h2>
<p><strong>s2</strong></p>
<strong>s3</strong>
<strong>s4</strong>
<h2>t3</h2>
<strong>s5</strong>')->nodes;
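Get a sequential list of the top-level 'h2' and 'strong' elements (a 'strong' nested inside a 'p' is not a direct child of the root, so it is skipped):
$list = [];
// $nodes[0] is the root node; walk its direct children in document order.
foreach ($nodes[0]->children as $child) {
    if ($child->tag == 'h2' || $child->tag == 'strong') {
        $list[] = $child->innertext;
    }
}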
Result:
array (size=7)
  0 => string 't1' (length=2)
  1 => string 's1' (length=2)
  2 => string 't2' (length=2)
  3 => string 's3' (length=2)
  4 => string 's4' (length=2)
  5 => string 't3' (length=2)
  6 => string 's5' (length=2)
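Get a nested list, grouping each top-level 'strong' under the 'h2' that precedes it:
$nested = [];
$a = -1; // index of the current 'h2' group
foreach ($nodes[0]->children as $child) {
    if ($child->tag == 'h2') {
        // A new heading starts a new group.
        $a++;
        $nested[$a]['value'] = $child->innertext;
    } elseif ($child->tag == 'strong') {
        // Attach the 'strong' to the most recent heading.
        $nested[$a]['children'][] = $child->innertext;
    }
}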
Result:
array (size=3)
  0 =>
    array (size=2)
      'value' => string 't1' (length=2)
      'children' =>
        array (size=1)
          0 => string 's1' (length=2)
  1 =>
    array (size=2)
      'value' => string 't2' (length=2)
      'children' =>
        array (size=2)
          0 => string 's3' (length=2)
          1 => string 's4' (length=2)
  2 =>
    array (size=2)
      'value' => string 't3' (length=2)
      'children' =>
        array (size=1)
          0 => string 's5' (length=2)
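To get exactly the pairs asked for (each 'h2' with the first 'strong' after it, including a 'strong' nested inside a 'p'), the elements need to be visited in document order at any depth. Here is a minimal sketch of that idea. Note that it swaps in PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of simple_html_dom, and the function name firstStrongPerHeading is a hypothetical helper chosen for illustration, not part of either library.

```php
<?php
// Sample markup from the question; the stray </hr> is kept as posted.
$html = <<<'HTML'
<h2>t1</h2>
<strong>s1</strong>
</hr>
<h2>t2</h2>
<p><strong>s2</strong></p>
<strong>s3</strong>
<strong>s4</strong>
<h2>t3</h2>
<strong>s5</strong>
HTML;

/**
 * Hypothetical helper: pair each h2 with the first strong that follows it
 * in document order, at any nesting depth.
 */
function firstStrongPerHeading(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser errors internally so malformed markup such as </hr>
    // does not emit warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $pairs = [];
    $current = -1; // index of the most recent h2, -1 before the first one

    // The union of //h2 and //strong is returned in document order.
    foreach ($xpath->query('//h2 | //strong') as $node) {
        if ($node->nodeName === 'h2') {
            $pairs[++$current] = ['h2' => trim($node->textContent), 'strong' => null];
        } elseif ($current >= 0 && $pairs[$current]['strong'] === null) {
            // Only the first strong after the current h2 is recorded.
            $pairs[$current]['strong'] = trim($node->textContent);
        }
    }
    return $pairs;
}

foreach (firstStrongPerHeading($html) as $pair) {
    echo $pair['h2'], "\n";
    if ($pair['strong'] !== null) {
        echo $pair['strong'], "\n";
    }
}
```

For the sample markup this prints t1, s1, t2, s2, t3, s5, one per line. An 'h2' with no 'strong' before the next 'h2' keeps a null entry, so the caller can decide how to handle it.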