I'm trying to scrape the link and title from the following piece of HTML and can't find a way to do it, despite reading the Scrapy docs for a while.
<h3 class="data">
<a href="example.com" title="uniqueTitle"></a>
</h3>
What's the best way of doing this? Also, I should note that there are many of these <h3> elements on the page with the same class but different <a> tags that I want to scrape.
Thanks in advance!
To get all the URLs within the h3 tags, you can use, e.g.:
from scrapy import Selector
sel = Selector(text='''<h3 class="data">
<a href="example.com" title="uniqueTitle"></a>
</h3>''')
print(sel.css('h3.data > a::attr(href)').extract())  # href of every <a> directly under h3.data
Output:
['example.com']