Is there a way, when using DOM_Document XPath, to search in reverse (from the end of the page moving up, instead of from the top down)? If so, how would I do this?
I am scraping a web site (linked below): http://www.sturmfh.com/obit-display.jhtml?DB=update/obits/dbase&DO=display&ID=1189477693_24578
I only want to scrape the 3 obituary paragraphs, so I figured it'd be easiest to start at the end and move up.
Use:
(//p)[position() > count(//p) - 3]
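This selects the last (up to three) p elements in the XML document. The parentheses around //p matter: they make position() count across all p elements in document order, rather than among the siblings under each parent.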
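For comparison, here is a minimal sketch of the same "last up to three" selection done in host code instead of XPath. It uses Python's standard-library ElementTree, which supports only a subset of XPath (no count()), so the final step is a list slice. The SAMPLE markup and the last_paragraphs helper are made up for illustration, and the input is assumed to be well-formed XML, which a real scraped page may not be.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the scraped page (assumed well-formed XML).
SAMPLE = """<html><body>
  <div><p>menu</p></div>
  <div><p>first</p><p>second</p></div>
  <p>third</p>
</body></html>"""


def last_paragraphs(xml_text, n=3):
    """Return the last n (or fewer) <p> elements in document order."""
    root = ET.fromstring(xml_text)
    # iter() walks the tree in document order, much like //p does.
    paragraphs = list(root.iter("p"))
    # A slice quietly returns fewer items when the document has fewer than n.
    return paragraphs[-n:] if n > 0 else []


texts = [p.text for p in last_paragraphs(SAMPLE)]
print(texts)  # ['first', 'second', 'third']
```

As with the XPath expression, a document with fewer than three p elements simply yields all of them.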
XSLT-based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy-of select="(//p)[position() > count(//p) - 3]"/>
</xsl:template>
</xsl:stylesheet>
When applied to the document referenced in the question, this transformation evaluates the XPath expression and outputs the selected p elements.
The result is:
<p>
If you would like to share your thoughts and memories,<br/> we will deliver your message to the family.<br/>
<a href="mailto:staff@sturmfh.com?Subject=For%20the%20Family%20of%20Lyle%20Meier">Click</a>
<a href="mailto:staff@sturmfh.com?Subject=For%20the%20Family%20of%20Lyle%20Meier">
<img src="/images/email_condol.gif" alt="Logo" border="0" align="middle"/>
</a>
<a href="mailto:staff@sturmfh.com?Subject=For%20the%20Family%20of%20Lyle%20Meier">here</a>.
</p>
<p>To Request a Tribute Folder
<br/>
<a href="./obit-foldreq.jhtml?fname=Lyle&lname=Meier">Click</a>
<a href="./obit-foldreq.jhtml?fname=Lyle&lname=Meier">
<img src="/images/email_condol.gif" border="0" alt="View" align="top"/>
</a>
<a href="./obit-foldreq.jhtml?fname=Lyle&lname=Meier">here</a>
</p>