I cannot get this to work for the life of me. I'm trying to load a web page and click a button on it, and I can't get it to work. Selenium either complains that it can't create a session, complains that it doesn't have proper options, doesn't load, loads forever, or just straight up does not work.
Dockerfile
FROM python:3.11-slim-buster
USER root
# Create a non-root user
RUN useradd -ms /bin/bash appuser
WORKDIR /app
RUN chown appuser:appuser /app
USER appuser
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy
COPY src .
# Expose the application port (e.g., 5000)
EXPOSE 5000
# Define the command to run the application
CMD ["python3", "app.py"]
Docker-compose.yml
version: '3.8'
services:
  chrome:
    image: selenium/node-chrome:3.14.0-gallium
    volumes:
      - /dev/shm:/dev/shm
    depends_on:
      - hub
    environment:
      HUB_HOST: hub
  hub:
    image: selenium/hub:3.14.0-gallium
    ports:
      - "4444:4444"
  web:
    build: .
    depends_on:
      - hub
    volumes:
      - ./src:/app
    ports:
      - "5000:5000"
app.py
from flask import Flask, render_template, request
import requests
import re
import os
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import urllib.parse
from selenium.webdriver.chrome.options import Options

def download_page(url):
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.page_load_strategy = 'normal'
    chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
    chrome_options.add_experimental_option('useAutomationExtension', False)
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--lang=en')
    chrome_options.add_argument('--ignore-certificate-errors')
    chrome_options.add_argument('--allow-running-insecure-content')
    chrome_options.add_argument('--disable-notifications')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument('--disable-browser-side-navigation')
    chrome_options.add_argument('--mute-audio')
    chrome_options.add_argument('--force-device-scale-factor=1')
    chrome_options.add_argument('window-size=1080x760')
    driver = webdriver.Remote('http://hub:4444/wd/hub')
    driver.get(url)
    # Process page or click buttons

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/process', methods=['POST'])
def process():
    url = request.form['url']
    download_page(url)
    return "URL processing complete!"

if __name__ == '__main__':
    app.run(host='0.0.0.0', debug=True)
index.html
<!DOCTYPE html>
<html>
<head>
    <title>URL Processor</title>
</head>
<body>
    <h1>Enter a URL to process:</h1>
    <form method="POST" action="/process">
        <input type="text" name="url" placeholder="Enter URL here">
        <button type="submit">Process URL</button>
    </form>
</body>
</html>
I have tried using selenium/standalone-chrome as the Docker base image, but it does not allow pip to install Flask because the Python environment is "externally managed".
I have tried loading it externally, but it complains that it can't create a session: SessionNotCreatedException.
I tried loading it internally, but it complains that it can't find the Chrome driver, and when I tried installing it, the install just hung. No error, nothing; it just sat there.
If I run it standalone without Flask, it works PERFECTLY fine. It's only when I try to wrap it in Docker that it stops me at every turn. It also does not help that the Selenium documentation is outdated.
You are creating the Chrome options, but you are not passing them to the WebDriver, so the remote session is requested without them.
When I pass the options to it, it works fine for me.
Change this line:
driver = webdriver.Remote('http://hub:4444/wd/hub')
to
driver = webdriver.Remote('http://hub:4444/wd/hub', options=chrome_options)
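For context, here is a minimal sketch of how the whole function might look with that change applied. It is not a drop-in replacement: the `make_remote_driver` helper, the `driver_factory` parameter, and the trimmed-down list of Chrome arguments are my own assumptions for illustration, and the `try`/`finally` with `driver.quit()` is an addition that was not in the question (without it, each request may leave a session open on the node).

```python
HUB_URL = "http://hub:4444/wd/hub"  # "hub" is the service name from docker-compose.yml


def make_remote_driver(hub_url=HUB_URL):
    """Create a Remote session on the hub with the Chrome options attached."""
    # Imported here so this module can be loaded (and exercised with a fake
    # driver) on machines where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    # The key change: hand the options to Remote instead of dropping them.
    return webdriver.Remote(command_executor=hub_url, options=chrome_options)


def download_page(url, driver_factory=make_remote_driver):
    """Load `url` in a fresh browser session and return the page source."""
    driver = driver_factory()
    try:
        driver.get(url)
        # Process the page or click buttons here.
        return driver.page_source
    finally:
        # Always release the session so the node is free for the next request.
        driver.quit()
```

The Flask route can keep calling `download_page(url)` as before. The `driver_factory` argument only exists so the function can be tried out with a stand-in driver, without a running hub.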