This HTML is in my page:
<tr>
<td class="padded2" bgcolor="#103A74"><font color="White">Refine by Vehicle Types</font></td>
</tr><tr>
<td class="padded2" bgcolor="White"><div>
<table border="0">
<tr>
<td class="padded2"><font color="#103A74"><ul><li><a class="padded2"> Cars</a></li><li><a class="padded2">Marine Engines</a></li><li><a class="padded2">Trucks</a></li></ul></font></td>
</tr>
</table>
</div></td>
</tr>
I want to scrape "Cars" and "Trucks" based on the fact that they come after "Refine by Vehicle Types". I've tried many different ways, and this is as close as I can get, but it returns NULL.
$Nodes = $xPath->query("//tr/td/font[text()[contains(., 'Refine by Vehicle Type')]]/following-sibling::tr/td/div/table/tr/td/font/ul/li/a")->item(0)->nodeValue;
What am I missing?
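Your error is in this step:
...font[...]/following-sibling::tr/...
In the provided XML fragment, the <font> element has no sibling elements, so following-sibling::tr selects nothing and the whole expression returns an empty node list. The <tr> you want is a sibling of the <tr> that contains the heading, not of the <font>.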
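This can be confirmed from PHP by running only that part of the expression and inspecting the result. The sketch below is an illustration, not your actual page: it loads a reduced, hypothetical two-row fragment with DOMDocument::loadXML, and the variable names are made up for the example.

```php
<?php
// Hypothetical reduced fragment: the heading row plus one following row.
$xml = '<table><tr><td class="padded2" bgcolor="#103A74">'
     . '<font color="White">Refine by Vehicle Types</font></td></tr>'
     . '<tr><td>second row</td></tr></table>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xPath = new DOMXPath($doc);

// Same first steps as in the question: select the <font>,
// then ask for <tr> elements that are siblings of that <font>.
$siblingRows = $xPath->query(
    "//tr/td/font[text()[contains(., 'Refine by Vehicle Type')]]/following-sibling::tr"
);

$siblingCount = $siblingRows->length;  // the <font> has no siblings at all
$firstMatch   = $siblingRows->item(0); // item() returns null for a missing index

var_dump($siblingCount, $firstMatch);
```

Because item(0) is null here, reading ->nodeValue from it evaluates to NULL (with a notice or warning, depending on the PHP version), which matches what you are seeing.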
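Here is one correct XPath expression. It is relative, so the context node must be the parent of the heading <tr>; to evaluate it from the document node, prefix it with //: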
tr[td[contains(., 'Refine by Vehicle Types')]]
/following-sibling::tr
/td/div/table
/tr/td/font
/ul/li/a
When evaluated against the following XML document (your provided fragment wrapped in a <table>):
<table>
  <tr>
    <td class="padded2" bgcolor="#103A74">
      <font color="White">Refine by Vehicle Types</font>
    </td>
  </tr>
  <tr>
    <td class="padded2" bgcolor="White">
      <div>
        <table border="0">
          <tr>
            <td class="padded2">
              <font color="#103A74">
                <ul>
                  <li>
                    <a class="padded2"> Cars</a>
                  </li>
                  <li>
                    <a class="padded2">Marine Engines</a>
                  </li>
                  <li>
                    <a class="padded2">Trucks</a>
                  </li>
                </ul>
              </font>
            </td>
          </tr>
        </table>
      </div>
    </td>
  </tr>
</table>
the following elements are selected:
<a class="padded2"> Cars</a>
<a class="padded2">Marine Engines</a>
<a class="padded2">Trucks</a>
XSLT-based verification:
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/*">
    <xsl:copy-of select=
     "tr[td[contains(., 'Refine by Vehicle Types')]]
        /following-sibling::tr
          /td/div/table
            /tr/td/font
              /ul/li/a
     "/>
  </xsl:template>
</xsl:stylesheet>
When this transformation is applied to the XML document above, the selected elements are output:
<a class="padded2"> Cars</a>
<a class="padded2">Marine Engines</a>
<a class="padded2">Trucks</a>
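For the PHP side of the question, the same expression can be run through DOMXPath. This is a minimal sketch under a few assumptions: the fragment is loaded as XML with DOMDocument::loadXML (for a full page, DOMDocument::loadHTML would be the usual loader, and the parsed tree may then differ from the one shown here), the expression is prefixed with // because the query starts at the document node, and the variable names are only illustrative.

```php
<?php
// The fragment from the question, wrapped in a <table> so it is one well-formed document.
$xml = <<<'XML'
<table>
<tr>
<td class="padded2" bgcolor="#103A74"><font color="White">Refine by Vehicle Types</font></td>
</tr><tr>
<td class="padded2" bgcolor="White"><div>
<table border="0">
<tr>
<td class="padded2"><font color="#103A74"><ul><li><a class="padded2"> Cars</a></li><li><a class="padded2">Marine Engines</a></li><li><a class="padded2">Trucks</a></li></ul></font></td>
</tr>
</table>
</div></td>
</tr>
</table>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xPath = new DOMXPath($doc);

// Find the row whose cell contains the heading, step to the rows after it,
// then walk down to the links inside the nested table.
$nodes = $xPath->query(
    "//tr[td[contains(., 'Refine by Vehicle Types')]]"
    . "/following-sibling::tr/td/div/table/tr/td/font/ul/li/a"
);

// Collect the link texts; trim() removes the leading space in " Cars".
$labels = [];
foreach ($nodes as $node) {
    $labels[] = trim($node->nodeValue);
}

print_r($labels); // Cars, Marine Engines, Trucks
```

Iterating over the node list, instead of calling ->item(0)->nodeValue directly, also avoids reading a property from null when nothing matches. To keep only "Cars" and "Trucks", filter $labels afterwards.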
I would recommend using an XPath Visualizer to get up to speed quickly with writing correct and elegant XPath expressions.