selenium-webdriver aws-lambda google-chrome-headless ruby-2.7

Which ChromeDriver and headless Chrome versions are compatible with Ruby 2.7?


The issue

I have a web scraper running in AWS Lambda, but in a few weeks AWS Lambda will stop supporting Ruby 2.7. I built my scraper last year using this tutorial.

I need to find a version of ChromeDriver and headless Chrome that is compatible with Ruby 2.7, but I don't know where to start.

I have looked at the ChromeDriver downloads portal, but I don't see any indication there that ChromeDriver will work with Ruby 2.7, or with any other specific version of Ruby for that matter.

The code I have works by accessing the ChromeDriver binary and starting it inside a specific folder.

I downloaded the specific binaries I am using by running these commands:

# serverless chrome
wget https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-37/stable-headless-chromium-amazonlinux-2017-03.zip
unzip stable-headless-chromium-amazonlinux-2017-03.zip -d bin/
rm stable-headless-chromium-amazonlinux-2017-03.zip

# chromedriver
wget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip
unzip chromedriver_linux64.zip -d bin/
rm chromedriver_linux64.zip
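
As a rough illustration of the layout these commands produce, the sketch below locates the two binaries under bin/ and asks each one for its version string. It is not the scraper's actual code. The file names `headless-chromium` and `chromedriver` are what the two archives are assumed to unpack to, the helper names `binary_paths` and `binary_version` are hypothetical, and it is an assumption that both binaries answer `--version` (chromedriver does; the headless build may not, in which case the helper returns nil).

```ruby
require 'open3'

# Assumed file names inside bin/ after unzipping the two archives.
BINARIES = {
  chromium: 'headless-chromium',
  chromedriver: 'chromedriver'
}.freeze

# Build absolute paths to both binaries under the given bin directory.
def binary_paths(bin_dir)
  BINARIES.transform_values { |name| File.expand_path(name, bin_dir) }
end

# Run `<binary> --version` and return its trimmed output.
# Returns nil when the file is missing, not executable, or the command fails
# (assumption: the binary understands the --version flag).
def binary_version(path)
  return nil unless File.executable?(path)

  output, status = Open3.capture2e(path, '--version')
  status.success? ? output.strip : nil
end

if $PROGRAM_NAME == __FILE__
  binary_paths(File.join(Dir.pwd, 'bin')).each do |name, path|
    puts "#{name}: #{binary_version(path) || 'unavailable'}"
  end
end
```

Printing both versions side by side is useful because a ChromeDriver release is matched against Chrome/Chromium versions; it is a standalone binary and is not tied to a particular Ruby version.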

Solution

    I found the solution to this problem. The Ruby 2.7 runtime that Lambda offers by default runs on top of Amazon Linux 2, which lacks many important libraries and dependencies. Unfortunately, there is nothing you can do to change that.

    However, Lambda also lets you run your code from a custom Docker image, which can be up to 10 GB in size.

    I fixed this problem by building my own image from the following Dockerfile:

    FROM public.ecr.aws/lambda/ruby:2.7
    
    # Install dependencies needed to run MySQL & Chrome
    
    RUN yum -y install libX11
    RUN yum -y install dejavu-sans-fonts
    RUN yum -y install procps
    RUN yum -y install mysql-devel
    RUN yum -y install tree
    RUN mkdir /var/task/lib
    RUN cp /usr/lib64/mysql/libmysqlclient.so.18 /var/task/lib
    RUN gem install bundler
    RUN yum -y install wget
    RUN yum -y groupinstall 'Development Tools'
    
    # Ruby Gems
    
    ADD Gemfile ${LAMBDA_TASK_ROOT}/
    ADD Gemfile.lock ${LAMBDA_TASK_ROOT}/
    RUN bundle config set path 'vendor/bundle' && \
        bundle install
    
    # Install chromedriver & chromium
    
    RUN mkdir ${LAMBDA_TASK_ROOT}/bin
    
    # Chromium
    RUN wget https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-37/stable-headless-chromium-amazonlinux-2017-03.zip
    RUN unzip stable-headless-chromium-amazonlinux-2017-03.zip -d ${LAMBDA_TASK_ROOT}/bin/
    RUN rm stable-headless-chromium-amazonlinux-2017-03.zip
    
    # Chromedriver
    
    RUN wget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip
    RUN unzip chromedriver_linux64.zip -d ${LAMBDA_TASK_ROOT}/bin/
    RUN rm chromedriver_linux64.zip
    
    # Copy function code
    
    COPY app.rb ${LAMBDA_TASK_ROOT}
    
    WORKDIR ${LAMBDA_TASK_ROOT}
    
    RUN tree
    RUN ls ${LAMBDA_TASK_ROOT}/bin
    # Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
    CMD [ "app.handle" ]
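
For context, `CMD [ "app.handle" ]` tells the Lambda Ruby runtime to call a method named `handle` defined in `app.rb`. The sketch below shows one shape such a file could take; it is not the scraper from this answer. It assumes the selenium-webdriver gem (3.142 or later) is listed in the Gemfile, that the archives unpack to `bin/headless-chromium` and `bin/chromedriver`, and that the listed Chrome flags suit your function. The names `chrome_arguments` and `build_driver` and the `url` event key are hypothetical.

```ruby
# app.rb (sketch, under the assumptions stated above)

# Root of the function code inside the image; Lambda sets LAMBDA_TASK_ROOT.
TASK_ROOT = ENV.fetch('LAMBDA_TASK_ROOT', Dir.pwd)
CHROMIUM_PATH = File.join(TASK_ROOT, 'bin', 'headless-chromium')
CHROMEDRIVER_PATH = File.join(TASK_ROOT, 'bin', 'chromedriver')

# Flags often used for headless Chrome in a sandboxed environment where
# only /tmp is writable. This is an assumed set; adjust it as needed.
def chrome_arguments
  %w[
    --headless
    --no-sandbox
    --disable-gpu
    --single-process
    --window-size=1280,1024
    --user-data-dir=/tmp/chrome-user-data
  ]
end

# Build a Selenium driver that uses the binaries bundled in the image.
def build_driver
  require 'selenium-webdriver' # loaded lazily so this file parses without the gem

  Selenium::WebDriver::Chrome::Service.driver_path = CHROMEDRIVER_PATH
  options = Selenium::WebDriver::Chrome::Options.new
  options.binary = CHROMIUM_PATH
  chrome_arguments.each { |arg| options.add_argument(arg) }
  Selenium::WebDriver.for(:chrome, options: options)
end

# Entry point referenced by CMD [ "app.handle" ].
def handle(event:, context:)
  driver = build_driver
  driver.get(event.fetch('url'))
  { 'title' => driver.title }
ensure
  # Always shut the browser down, even if navigation raised.
  driver&.quit
end
```

Keeping the paths relative to `LAMBDA_TASK_ROOT` mirrors the Dockerfile, which unpacks both binaries into `${LAMBDA_TASK_ROOT}/bin`.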
    

    Notes

    1. If your code was previously deployed as a zip file, you will have to either destroy the existing function or create a second function for the container-based code. Which one you choose comes down to how you want to handle deployment.
    2. It is possible to automate the deployment process using the Serverless Framework.