
How to deploy Selenium-driven spiders on the cloud

I use scrapyd to deploy and schedule my spiders on my local machine. The challenge I face now is deploying spiders that run with a headless browser.

I get two errors in my scrapyd log file, both related to the webdriver not being found in the project directory:

FileNotFoundError: [Errno 2] No such file or directory: './chromedriver'

selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. 

Below is a copy of my code

# I'm using SeleniumRequest for my requests, so this is the configuration in my settings file

SELENIUM_DRIVER_NAME = 'chrome' # Change to your browser name
SELENIUM_DRIVER_ARGUMENTS=['--headless']  # '--headless' if using chrome instead of firefox


Here is my spider code

import scrapy
from scrapy_selenium import SeleniumRequest
from scrapy.selector import Selector
import time

class CovidngSpider(scrapy.Spider):
    name = 'covidng'
    #allowed_domains = ['']
    #start_urls = ['']

    def start_requests(self):
        yield SeleniumRequest(url='', wait_time=3, screenshot=True, callback=self.parse)

    def parse(self, response):
        # scrapy-selenium passes the live webdriver through response.meta
        driver = response.meta['driver']
        page_html = driver.page_source
        new_resp = Selector(text=page_html)

        databox = new_resp.xpath("//table[@id='custom3']/tbody/tr")

        # One table row per state; each cell wraps its value in a <p>
        for rows in databox:
            state = rows.xpath(".//td[1]/p/text()").get()
            total_cases = rows.xpath(".//td[2]/p/text()").get()
            active_cases = rows.xpath(".//td[3]/p/text()").get()
            discharged = rows.xpath(".//td[4]/p/text()").get()
            death = rows.xpath(".//td[5]/p/text()").get()

            yield {
                'State': state,
                'Total Cases': total_cases,
                'Active Cases': active_cases,
                'Discharged': discharged,
                'Death': death
            }


  • First: check that chromedriver is installed. It is not part of Selenium, so you always have to install it separately. (The same applies to geckodriver if you use Firefox.)

    Second: use /full/path/to/chromedriver. The system may run your code in a different folder than you expect, in which case the relative path ./chromedriver points to a different place than you expect.
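
    Both points can be combined in the settings file. The following is only a sketch, not a definitive setup: scrapy-selenium reads the driver location from its SELENIUM_DRIVER_EXECUTABLE_PATH setting, while the helper resolve_driver_path and its search_dirs parameter are hypothetical names made up for this example. It assumes chromedriver is either on PATH or in one of the directories you pass in.

```python
import os
import shutil


def resolve_driver_path(name="chromedriver", search_dirs=()):
    """Return an absolute path to the driver executable, or None if not found.

    Hypothetical helper for illustration; not part of Scrapy or scrapy-selenium.
    """
    # Check PATH first: this is where a separately installed driver usually lives.
    found = shutil.which(name)
    if found:
        return os.path.abspath(found)
    # Otherwise look in explicitly listed directories, always building an
    # absolute path so the result does not depend on the working directory.
    for directory in search_dirs:
        candidate = os.path.abspath(os.path.join(directory, name))
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# settings.py
SELENIUM_DRIVER_NAME = "chrome"
SELENIUM_DRIVER_EXECUTABLE_PATH = resolve_driver_path()  # None if not installed
SELENIUM_DRIVER_ARGUMENTS = ["--headless"]
```

    If SELENIUM_DRIVER_EXECUTABLE_PATH ends up as None, chromedriver is not installed (or not executable) on the machine where scrapyd runs the spider, which is the first thing to fix.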