I'm building a link scraper in CasperJS, and the main function looks pretty much like this:
function findLinks() {
    return Array.prototype.map.call(document.querySelectorAll('a'), function(e){
        return {
            href: e.href,
            title: e.title,
            rel: e.rel,
            anchor: e.text,
            innerHTML: e.innerHTML
        };
    });
}
However, I'd like to modify findLinks() so that when my link scraper finds something like this:
<a href="#" title="anchor tag" rel="nofollow"><img src="myimage.jpg" alt="beautiful image" /></a>
I can access the <img> attributes individually, just as I do with the links.
I've been reading MDN and the CasperJS docs, and I haven't yet found a way to achieve this.
Any help will be greatly appreciated!
You're looking for Element.children.
children returns a collection of child elements of the given element.
In your example HTML:
var b = document.querySelectorAll('a')[0];
alert(b.children[0].src); // First child's src, resolved to an absolute URL ending in myimage.jpg
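Applied to your scraper, a minimal sketch of findLinks() could look like the following. A few things here are my assumptions rather than anything from your code: the `images` property name, the choice of `src` and `alt` as the attributes to collect, and the optional `root` parameter, which only exists so the function can be exercised outside a page (it falls back to `document`, so `casper.evaluate(findLinks)` should behave as before). It only looks at direct children of each link, as `children` does.

```javascript
// Sketch only: extends findLinks() so each link also reports its <img> children.
// `root` is a hypothetical parameter; when omitted, the page's document is used.
function findLinks(root) {
    root = root || document;
    return Array.prototype.map.call(root.querySelectorAll('a'), function (e) {
        // e.children contains element nodes only (text nodes are skipped).
        var images = Array.prototype.filter.call(e.children, function (child) {
            return child.tagName.toUpperCase() === 'IMG';
        }).map(function (img) {
            return {
                src: img.getAttribute('src'), // value as written in the HTML
                alt: img.alt
            };
        });
        return {
            href: e.href,
            title: e.title,
            rel: e.rel,
            anchor: e.text,
            innerHTML: e.innerHTML,
            images: images // empty array when the link has no direct <img> child
        };
    });
}
```

Each entry then exposes the image data alongside the link data, e.g. `links[0].images[0].alt`. If the images may be nested deeper inside the link (say, wrapped in a <span>), `e.querySelectorAll('img')` in place of the `children` filter would pick those up too.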