Given HTML:
<div id="testid">
<h1>Test Title</h1>
<ul class="clearfix">
<li class="anker" id="artists-A"></li>
<li class="first">
<a href="www.test1.html" title="Test1">
<span>
<img src="https://www.test1.de/img/test1.jpg" alt="Test1" />
<span>Test1</span>
</span>
</a>
</li>
<li>
<a href="www.test2.html" title="Test2">
<span>
<img src="https://www.test2.de/img/test2.jpg" alt="Test2" />
<span>Test2</span>
</span>
</a>
</li>
<li class="first">
<a href="www.test3.html" title="Test3">
<span>
<img src="https://www.test1.de/img/test3.jpg" alt="Test3" />
<span>Test3</span>
</span>
</a>
</li>
</ul>
</div>
I need to get each a element's href value, the img src, and the inner span text (i.e. the title). I am parsing this with DOMDocument but am not getting the expected result.
Code:
$doc = new DomDocument;
$doc->validateOnParse = true;
$doc->loadHtml(file_get_contents($url));
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//*[@id="testid"]/ul/li');
This uses DOMDocument. For now I am gathering each a element's href and title and each img element's src; you can add any further tags you want.
$domDocument = new DOMDocument();
$domDocument->loadHTML($string);//$string is assumed to hold the HTML shown in the question
$domXPath = new DOMXPath($domDocument);
$results = $domXPath->query("//div[@id='testid']");//querying div with id="testid"
$results = $domXPath->query(".//a|.//img",$results->item(0));//leading "." restricts the search to that div; a bare "//a" would scan the whole document
$data=array();
foreach($results as $result){
    if($result->tagName=="a")//checking for anchor tags
    {
        $data["a"][]=array(
            "href"=>$result->getAttribute("href"),
            "title"=>$result->getAttribute("title")
        );
    }
    elseif($result->tagName=="img")//checking for image tags
    {
        $data["img"][]=$result->getAttribute("src");
    }
}
print_r($data);
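If you would rather have the href, src and span text grouped per list item instead of in separate arrays, something along these lines might work. It is only a sketch: the $html variable, the shortened sample markup and the $items array are assumptions made for illustration, and it relies on each link wrapping a span that contains the img and an inner span, as in the question.

```php
<?php
// Sample markup shortened from the question; in practice this could come from file_get_contents($url).
$html = <<<HTML
<div id="testid">
<h1>Test Title</h1>
<ul class="clearfix">
<li class="anker" id="artists-A"></li>
<li class="first">
<a href="www.test1.html" title="Test1">
<span>
<img src="https://www.test1.de/img/test1.jpg" alt="Test1" />
<span>Test1</span>
</span>
</a>
</li>
<li>
<a href="www.test2.html" title="Test2">
<span>
<img src="https://www.test2.de/img/test2.jpg" alt="Test2" />
<span>Test2</span>
</span>
</a>
</li>
</ul>
</div>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$items = array();

// One iteration per <li> that has a direct <a> child; the empty "anker" item is skipped.
foreach ($xpath->query('//div[@id="testid"]/ul/li[a]') as $li) {
    $a    = $xpath->query('./a', $li)->item(0);        // the link inside this <li>
    $img  = $xpath->query('.//img', $a)->item(0);      // first image inside the link
    $span = $xpath->query('./span/span', $a)->item(0); // inner span holding the title text
    $items[] = array(
        'href'  => $a->getAttribute('href'),
        'src'   => $img ? $img->getAttribute('src') : null,
        // fall back to the link's title attribute if the inner span is missing
        'title' => $span ? trim($span->textContent) : $a->getAttribute('title'),
    );
}

print_r($items);
```

Every inner query starts with "." and passes the current node as the second argument, so each lookup stays inside its own li and the three values always belong to the same entry.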