I am using hpple to parse some HTML. I am using Xcode 4.6 and iOS 6.1. It looks like this.
I can extract the text and images by using the following XPath query strings:
Text ==> //div[@class = 'entry-content']/p
Images ==> //div[@class = 'entry-content']//img/@src
However, I also need to get the text near the bottom, "Retiring Stamp Set PDF". This text changes, but the format is usually the same. I tried the following path:
div[@class = 'entry-content']//a[@title]//text()
But that did not work. I am placing all the results in an array, and I can see that I get a null back for that entry instead of the text. I have looked at the XPath syntax, but cannot get any further. Does anyone have any suggestions?
I figured it out! For anyone who has viewed this, here is the answer that works for my HTML file.
To get the text, use:
//div[@class = 'entry-content']//a[@title]//*
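This returns every element nested under an a element that has a title attribute, inside the entry-content div. In my HTML, those nested elements hold the text I needed.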
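To illustrate what that query selects, here is a minimal sketch that uses Python's standard-library ElementTree as a stand-in for hpple. The sample markup, file names, and title value are hypothetical, since the real page is not reproduced here, and ElementTree supports only a subset of XPath (it needs a leading "." and does not support text() or @src selection), so treat this as an approximation rather than what hpple does internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page (assumed structure).
SAMPLE = """
<html><body>
  <div class="entry-content">
    <p>Some post text.</p>
    <p><img src="stamp.png" /></p>
    <p><a href="retiring.pdf" title="Retiring list"><strong>Retiring Stamp Set PDF</strong></a></p>
  </div>
</body></html>
"""

root = ET.fromstring(SAMPLE)

# The query from the answer: every element nested under an <a> that has
# a title attribute, inside the entry-content div.
nested = root.findall(".//div[@class='entry-content']//a[@title]//*")
nested_text = [el.text for el in nested]

# Alternative: select the <a> itself and join all of its descendant text.
links = root.findall(".//div[@class='entry-content']//a[@title]")
link_text = ["".join(a.itertext()).strip() for a in links]

print(nested_text)
print(link_text)
```

Note that //a[@title]//* only matches when the link text is wrapped in a child element, as it is in this sample. If the text sat directly inside the a element, that query would match nothing, while selecting the a element itself would still reach the text.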