I have some 'generic' methods that extract data using CSS selectors that are usually the same across many websites. However, I also have another method that accepts the CSS selector for a given website as an argument.
I need to call the get_title method if the title_selector argument is not passed. How can I do that?
def scrape(urls, item_selector, title_selector, price_selector, image_selector)
  collection = []
  urls.each do |url|
    doc = Nokogiri::HTML(open(url).read) # Opens URL
    @items = doc.css(item_selector)[0..1].map { |item| item['href'] } # Sets items
    @items.each do |item| # Download each link and parse
      page = Nokogiri::HTML(open(item).read)
      collection << {
        :title => page.css(title_selector).text, # I guess I need conditional here
        :price => page.css(price_selector).text
      }
    end
    @collection = collection
  end
end
def get_title(doc)
  og_title = doc.at_css("meta[property='og:title']")
  if og_title
    og_title['content'] # The og:title text is stored in the content attribute
  else
    doc.at_css('title').text
  end
end
Use an or operator inside your page.css call. It will call get_title if title_selector is falsey (nil).
:title => page.css(title_selector || get_title(doc)).text,
I'm not sure what doc should actually be in this context, though.
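To illustrate how the or operator behaves here, this is a minimal sketch in plain Ruby. The pick_title method and the block passed to it are hypothetical stand-ins, not part of the original code; the block plays the role of get_title.

```ruby
# Hypothetical helper: returns the explicit title when one is given,
# otherwise runs the block (standing in for get_title) to compute a fallback.
def pick_title(explicit_title)
  # The right-hand side of || is evaluated only when the left side is nil or false.
  explicit_title || yield
end

puts pick_title('Explicit title') { 'Fallback title' }
puts pick_title(nil) { 'Fallback title' }
```

Note that an empty string is truthy in Ruby, so || will not fall back when a selector matches nothing and .text returns "".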
EDIT
Given your comment below, I think you can just refactor get_title to handle all of the logic. Allow get_title to take an optional title_selector parameter and add this line to the top of your method:
return doc.css(title_selector).text if title_selector
Then, my original line becomes:
:title => get_title(page, title_selector)