Docker run hangs when starting privoxy prior to containerized app


I have a Python FastAPI app that runs in a Kubernetes cluster on GKE. I'm trying to route the outbound traffic from the containers (pods) through privoxy. To test this, I'm building and running the container on my local machine, but when I run docker run -p 8080:8080 privoxy, I get the following log output, which hangs on the last line:

2020-09-08 13:32:15.342 7fb59e36de80 Info: Privoxy version 3.0.26
2020-09-08 13:32:15.342 7fb59e36de80 Info: Program name: privoxy
2020-09-08 13:32:15.344 7fb59e36de80 Info: Loading filter file: /etc/privoxy/default.filter
2020-09-08 13:32:15.345 7fb59e36de80 Info: Loading filter file: /etc/privoxy/user.filter
2020-09-08 13:32:15.345 7fb59e36de80 Info: Loading actions file: /etc/privoxy/match-all.action
2020-09-08 13:32:15.345 7fb59e36de80 Info: Loading actions file: /etc/privoxy/default.action
2020-09-08 13:32:15.348 7fb59e36de80 Info: Loading actions file: /etc/privoxy/user.action
2020-09-08 13:32:15.348 7fb59e36de80 Info: Listening on port 8118 on IP address 0.0.0.0

My question is: how can I start privoxy inside the Docker container at run time and then run my app (using privoxy) without privoxy hanging or throwing an error?

My Dockerfile looks something like this:

FROM continuumio/miniconda3:4.6.14

# ...

# Install Privoxy
RUN set -xe \
    && apt-get update \
    && apt-get install -y privoxy \
    && curl -sSL https://github.com/tianon/gosu/releases/download/1.9/gosu-amd64 > /usr/sbin/gosu \
    && chmod +x /usr/sbin/gosu

RUN sed -i -e '/^listen-address  \[/s/listen-address/#listen-address/' \
           -e '/^enforce-blocks/s/0/1/' \
           -e '/^#debug/s/#//' /etc/privoxy/config
VOLUME /etc/privoxy
EXPOSE 8118

# Install Firefox
RUN apt-get update && \
    apt-get -y install firefox-esr

# Install Geckodriver
RUN wget https://github.com/mozilla/geckodriver/releases/download/v0.24.0/geckodriver-v0.24.0-linux64.tar.gz && \
    tar xzf geckodriver-v0.24.0-linux64.tar.gz && \
    mv geckodriver /usr/bin/geckodriver


# ...

CMD start.sh

start.sh looks like this:

#!/usr/bin/env bash
gosu privoxy privoxy --no-daemon /etc/privoxy/config
cd /code
python app.py

When I start up the webdriver/Selenium I use this function:

import os

from selenium import webdriver
from selenium.webdriver.firefox.options import Options as FFOptions
from selenium.webdriver.firefox.webdriver import WebDriver as FirefoxWebDriver

def get_container_firefox_driver(windows_mask: bool=True):
    # create a new FireFox session
    os.environ['MOZ_FORCE_DISABLE_E10S'] = '1'

    ff_options = FFOptions()
    ff_options.add_argument('-new-instance')
    ff_options.add_argument('-headless')

    ff_profile = webdriver.FirefoxProfile()
    # set some privacy settings
    ff_profile.set_preference("places.history.enabled", False)
    ff_profile.set_preference("privacy.clearOnShutdown.offlineApps", True)
    ff_profile.set_preference("privacy.clearOnShutdown.passwords", True)
    ff_profile.set_preference("privacy.clearOnShutdown.siteSettings", True)
    ff_profile.set_preference("privacy.sanitize.sanitizeOnShutdown", True)
    ff_profile.set_preference("signon.rememberSignons", False)
    ff_profile.set_preference("network.cookie.lifetimePolicy", 2)
    ff_profile.set_preference("network.dns.disablePrefetch", True)
    ff_profile.set_preference("network.http.sendRefererHeader", 0)

    # set socks proxy
    ff_profile.set_preference("network.proxy.type", 1)
    ff_profile.set_preference("network.proxy.socks_version", 5)
    ff_profile.set_preference("network.proxy.socks", '127.0.0.1')
    ff_profile.set_preference("network.proxy.socks_port", 8118)
    ff_profile.set_preference("network.proxy.socks_remote_dns", True)

    # get a speed increase by not downloading images
    ff_profile.set_preference("permissions.default.image", 2)

    driver = webdriver.Firefox(
        firefox_profile=ff_profile,
        options=ff_options,
        executable_path="/usr/bin/geckodriver",
    )
    return driver
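
One thing worth checking in the function above: privoxy is an HTTP proxy rather than a SOCKS server, so pointing Firefox's SOCKS settings at port 8118 may not work as intended. The sketch below builds the equivalent HTTP/HTTPS proxy preferences instead. The helper name privoxy_http_proxy_prefs is hypothetical, and the host and port are assumptions taken from the log output earlier in the question.

```python
def privoxy_http_proxy_prefs(host: str = "127.0.0.1", port: int = 8118) -> dict:
    """Build Firefox preferences that send HTTP and HTTPS traffic to an HTTP proxy.

    Hypothetical helper: host and port default to the address privoxy
    reports in the question's log (assumption).
    """
    return {
        "network.proxy.type": 1,          # 1 = manual proxy configuration
        "network.proxy.http": host,       # proxy for plain HTTP requests
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,        # proxy for HTTPS requests
        "network.proxy.ssl_port": port,
    }


# Possible usage with the profile from the question:
#     for name, value in privoxy_http_proxy_prefs().items():
#         ff_profile.set_preference(name, value)
```

These preferences would replace the block under the "set socks proxy" comment; the rest of the profile setup stays the same.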

Other Attempts

I have also tried NOT commenting out the listen-address [::1]:8118 line of the privoxy config file by using this sed command in the Dockerfile:

RUN sed -i -e '/^enforce-blocks/s/0/1/' \
           -e '/^#debug/s/#//' /etc/privoxy/config

... but when I do that I get the following "Fatal error" when privoxy is starting up in the container:

2020-09-08 14:21:16.844 7fa4d8646e80 Info: Privoxy version 3.0.26
2020-09-08 14:21:16.844 7fa4d8646e80 Info: Program name: privoxy
2020-09-08 14:21:16.845 7fa4d8646e80 Info: Loading filter file: /etc/privoxy/default.filter
2020-09-08 14:21:16.847 7fa4d8646e80 Info: Loading filter file: /etc/privoxy/user.filter
2020-09-08 14:21:16.847 7fa4d8646e80 Info: Loading actions file: /etc/privoxy/match-all.action
2020-09-08 14:21:16.847 7fa4d8646e80 Info: Loading actions file: /etc/privoxy/default.action
2020-09-08 14:21:16.849 7fa4d8646e80 Info: Loading actions file: /etc/privoxy/user.action
2020-09-08 14:21:16.850 7fa4d8646e80 Info: Listening on port 8118 on IP address 127.0.0.1
2020-09-08 14:21:16.850 7fa4d8646e80 Fatal error: can't bind to ::1:8118: Cannot assign requested address
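
A "Cannot assign requested address" error on bind typically means the address is not configured on any interface, which is common for ::1 in containers where IPv6 is not enabled. A minimal sketch for checking this from inside the container follows; can_bind is a hypothetical helper, not part of privoxy or Docker.

```python
import socket


def can_bind(host: str, port: int = 0) -> bool:
    """Return True if a TCP socket can be bound to host:port, else False.

    Port 0 asks the OS for any free port, so the check does not
    collide with a running privoxy instance.
    """
    # Pick the address family from the literal: IPv6 addresses contain ':'.
    family = socket.AF_INET6 if ":" in host else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))
        return True
    except OSError:
        # Covers a missing address as well as an unsupported address family.
        return False


# Possible usage inside the container:
#     print("IPv4 loopback:", can_bind("127.0.0.1"))
#     print("IPv6 loopback:", can_bind("::1"))
```

If the IPv6 check returns False, that would be consistent with the fatal error above and with commenting out the listen-address [::1]:8118 line.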

Solution

  • The startup command for privoxy is incorrect for the operating system in use. Per the privoxy startup manual, the command on Debian Linux should be /etc/init.d/privoxy start --no-daemon. The gosu prefix is also not required and will crash the privoxy startup. The start.sh should look like this:

    #!/usr/bin/env bash
    /etc/init.d/privoxy start --no-daemon
    cd /code
    python app.py
    

  • Commenting out the listen-address [::1]:8118 line with RUN sed -i -e '/^listen-address \[/s/listen-address/#listen-address/ ... prevents a fatal error during privoxy startup, so it should be kept in the Dockerfile.
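
Whichever way privoxy is launched, the app should only start using the proxy once it is accepting connections. A minimal sketch of such a readiness check, assuming privoxy listens on 127.0.0.1:8118 as in the log output; wait_for_port is a hypothetical helper, not part of privoxy or Selenium.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0,
                  interval: float = 0.2) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False if no connection
    succeeds before `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Possible usage at the top of app.py (address is an assumption):
#     if not wait_for_port("127.0.0.1", 8118):
#         raise SystemExit("privoxy is not listening on 127.0.0.1:8118")
```

Calling this before creating the Firefox driver avoids a race where the first requests go out before the proxy is ready.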