ruby web-scraping nokogiri conditional-attribute

Ruby Conditional argument to method


I have some 'generic' methods that extract data using CSS selectors that are usually the same across many websites. However, I have another method that accepts the CSS selectors for a given website as arguments.

I need to call the get_title method if the title_selector argument is not passed. How can I do that?

Scrape method that accepts CSS selectors as arguments

  def scrape(urls, item_selector, title_selector, price_selector, image_selector)
    collection = []
    urls.each do |url|
      doc = Nokogiri::HTML(open(url).read) # Opens URL
      @items = doc.css(item_selector)[0..1].map {|item| item['href']} # Sets items
      @items.each do |item| # Download each link and parse it
        page = Nokogiri::HTML(open(item).read)
        collection << {
          :title   => page.css(title_selector).text, # I guess I need conditional here 
          :price  => page.css(price_selector).text
        }
      end
      @collection = collection
    end
  end

Generic title extractor

  def get_title(doc)
    # Prefer the Open Graph title when present, otherwise fall back to <title>
    og_title = doc.at_css("meta[property='og:title']")
    if og_title
      og_title['content'] # the title lives in the meta tag's content attribute
    else
      doc.at_css('title').text
    end
  end

Solution

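  • Use the || operator inside your page.css call. It will call get_title only if title_selector is falsey (nil or false). Note that the result of get_title is then passed to page.css as a selector, so this only works if get_title returns a selector rather than the title text.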

    :title => page.css(title_selector || get_title(doc)).text,
    

    I'm not sure what doc should actually be in this context, though.

    EDIT

    Given your comment below, I think you can just refactor get_title to handle all of the logic. Allow get_title to take an optional title_selector parameter and add this line to the top of your method:

    return doc.css(title_selector).text if title_selector
    

    Then, my original line becomes:

    :title => get_title(page, title_selector)
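
For reference, here is a minimal sketch of what the refactored get_title might look like with the optional parameter. It assumes doc is a parsed Nokogiri document (anything that responds to css and at_css) and that the generic fallback should read the og:title meta tag's content attribute before trying the title element. Both of those are assumptions on my part, not something confirmed in the question.

```ruby
# Sketch only: `doc` is assumed to be a parsed Nokogiri document, and
# `title_selector` is an optional CSS selector string for a specific site.
def get_title(doc, title_selector = nil)
  # A site-specific selector, when given, takes precedence.
  return doc.css(title_selector).text if title_selector

  # Otherwise fall back to the generic lookup: Open Graph title first...
  og_title = doc.at_css("meta[property='og:title']")
  return og_title['content'] if og_title

  # ...then the page's <title> element, or nil if the page has none.
  title = doc.at_css('title')
  title && title.text
end
```

With this in place, scrape no longer needs a conditional of its own: it always calls get_title(page, title_selector), and a caller that wants the generic lookup passes nil for title_selector.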