I recently used WebGrude to scrape some content from web pages. Then I tried to scrape some search results from eBay. Here is what I tried:
@Page("http://www.ebay.com/sch/{0}")
public class PirateBay {

    public static void main(String[] args) {
        //Search calls Browser, which loads the page on a PirateBay instance
        PirateBay search = PirateBay.search("iPhone");
        while (search != null) {
            search.magnets.forEach(System.out::println);
            search = search.nextPage();
        }
    }

    public static PirateBay search(String term) {
        return Browser.get(PirateBay.class, term);
    }

    private PirateBay() {
    }

    /*
     * This selector matches all magnet links. The result is added to this String list.
     * The default behaviour is to use the rendered html inside the matched tag, but here
     * we want to use the href value instead.
     */
    @Selector(value = "#ResultSetItems a[href*=magnet]", attr = "href")
    public List<String> magnets;

    /*
     * This selector matches a link to the next page result, which can be mapped to a PirateBay instance.
     * The Link next gets the page on the href attribute of the link when method visit is called.
     */
    @Selector("a:has(img[alt=Next])")
    private Link<PirateBay> next;

    public PirateBay nextPage() {
        if (next == null)
            return null;
        return next.visit();
    }
}
But the result is empty. How may I scrape search results using this?
The selector "#ResultSetItems a[href*=magnet]" matches only links whose href attribute contains the string "magnet". eBay result links normally do not contain that string, so nothing matches and the list stays empty.
You can read more about attribute selectors here: attribute_selectors
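To make the "contains" rule concrete, here is a minimal plain-Java sketch of the test that `[href*=magnet]` applies to each link. It uses no scraping library: the class name, the method `hrefsContaining`, and all href values are made up for illustration, and it ignores details such as how a given selector engine handles letter case.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: mimics the substring test behind [href*=magnet].
// The href values below are hypothetical and were not scraped from any site.
class AttributeContainsDemo {

    // Keeps only the hrefs whose value contains the given substring,
    // which is the rule the CSS "*=" attribute operator applies.
    static List<String> hrefsContaining(List<String> hrefs, String needle) {
        return hrefs.stream()
                .filter(href -> href.contains(needle))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical links from a torrent listing: one of them is a magnet link.
        List<String> torrentSiteHrefs =
                List.of("magnet:?xt=urn:btih:example", "/torrent/1");
        // Hypothetical links from an auction listing: none mention "magnet".
        List<String> auctionSiteHrefs =
                List.of("http://www.ebay.com/itm/example-1",
                        "http://www.ebay.com/itm/example-2");

        System.out.println(hrefsContaining(torrentSiteHrefs, "magnet").size()); // prints 1
        System.out.println(hrefsContaining(auctionSiteHrefs, "magnet").size()); // prints 0
    }
}
```

With the second list nothing passes the filter, which is the same reason the `magnets` field in the question ends up empty.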
What you want is "#ResultSetItems h3.lvtitle a"
To test your selectors, there is a nice REPL that uses jsoup, the same library WebGrude uses: Try jsoup