ruby-on-rails capybara capybara-webkit scraper

Capybara: Scraper visits the Canadian Indeed.com site instead of the US one


I've modified an out-of-date bot that applies to jobs on indeed.com, but they have redesigned their site again, so things no longer work and the bot is once more out of date.

I'm wondering how to visit the US version of indeed.com. There used to be a link that said "for US, click here", but it has been removed entirely. Now when I run the bot, I get job postings exclusively in Canada.

However, when I visit indeed.com from my browser as a Canadian resident, it takes me directly to the US site, which doesn't make sense to me. Is the bot downloading a different page? Is there a way to specify in the code that I want the US site, or that my browser comes from a US region/IP address?

Thank you in advance.

Here is the original code: https://github.com/jmopr/job-hunter/blob/master/scraper.rb

One additional problem: since I use capybara-webkit instead of Selenium, I seem to be unable to use save_and_open_page. Is there an alternative for webkit? Being able to see the page the bot is visiting would make debugging much easier.


Solution

  • If I visit the Canadian site ca.indeed.com, there is still a link at the bottom for US jobs; I'm not sure whether it's there for you. save_and_open_page and save_and_open_screenshot should both work with the capybara-webkit driver (which is what specifying :webkit gets you) as long as you call them on page. However, why not just switch to Firefox or Chrome for this, so you can see exactly what's happening?

    Remove the Capybara::Webkit.configure block and the require 'capybara-webkit' line. Instead, require selenium-webdriver and set Capybara.default_driver to :selenium for Firefox or :selenium_chrome for Chrome. You can set Capybara.javascript_driver the same way if you want, although it isn't actually doing anything in that code and could be removed.
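    As a rough sketch of that swap (not the original scraper's code): the snippet below assumes a reasonably recent Capybara that registers the :selenium_chrome driver, plus selenium-webdriver and a matching chromedriver on the machine. The session variable and the file names indeed.html and indeed.png are made up for illustration, and it uses a Capybara::Session directly (the same object that page returns when you include Capybara::DSL).

```ruby
require 'capybara'
require 'selenium-webdriver'

# Use a visible browser instead of capybara-webkit.
# :selenium drives Firefox, :selenium_chrome drives Chrome.
Capybara.default_driver = :selenium_chrome

# A standalone session; this is what `page` returns under Capybara::DSL.
session = Capybara::Session.new(Capybara.default_driver)

session.visit('https://www.indeed.com')

# Check which regional site the bot actually landed on.
puts session.current_url

# Dump what the bot sees for later inspection (illustrative file names).
session.save_page('indeed.html')
session.save_screenshot('indeed.png')
```

    Printing current_url should show whether the bot was redirected to ca.indeed.com. save_and_open_page works the same way as save_page but also tries to open the file in a browser, for which Capybara relies on the launchy gem.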