python selenium-webdriver pyinstaller

Files downloaded by my Selenium chromedriver web crawler are saved to the wrong directory


I'm trying to package my main.py script into an executable file using PyInstaller. The script contains a web crawler that uses Selenium with chromedriver.exe to navigate to a website and automatically download files (PDFs) into a specific directory named "Files," located in the same directory as main.py. Here’s a screenshot of the expected file structure for clarity.

When I run main.py directly, everything works as expected, with downloads going to the "Files" directory. However, after packaging with PyInstaller using the following command:

pyinstaller --onefile --add-data "chromedriver.exe;." --add-data "urls.txt;." main.py

and running the resulting .exe file (with chromedriver.exe and urls.txt included in the same directory), I encounter an issue: while the .exe successfully launches Chrome and downloads the files, it no longer creates or uses the "Files" directory in the same location. Instead, the downloads are saved to a temporary directory like C:\Users\{username}\AppData\Local\Temp\_MEI78762\Files, which is deleted after the program exits, so the downloaded files are inaccessible.

Below is the code I’m using to set the download path. The logic tries to detect the base path of the executable, but it’s not working as expected:

# Determine the base path
if getattr(sys, 'frozen', False):
    # If the application is run as a bundle, the PyInstaller bootloader
    # sets sys.frozen = True and stores the path of the unpacked bundle
    # in sys._MEIPASS.
    base_path = sys._MEIPASS
else:
    base_path = os.path.abspath(".")

# Create the Files directory if it doesn't exist
download_dir = os.path.join(base_path, "Files")
if not os.path.exists(download_dir):
    os.makedirs(download_dir)
# Extract all URLs from urls.txt and store them in a variable called urls
urls = []
with open("./test_urls.txt", "r") as file:
    urls = file.readlines()

# Configure Chrome options to set the download directory and disable the download prompt
chrome_options = webdriver.ChromeOptions()
prefs = {
    "download.default_directory": download_dir,
    "download.prompt_for_download": False,
    "directory_upgrade": True,
    "safebrowsing.enabled": True,
    "safebrowsing.disable_download_protection": True,  # Disable download protection
    "profile.default_content_setting_values.automatic_downloads": 1,  # Allow automatic downloads
    "profile.default_content_settings.popups": 0,  # Disable popups
    "profile.content_settings.exceptions.automatic_downloads.*.setting": 1  # Allow multiple downloads
}

Solution

  • When you package your script as a one-file executable, PyInstaller unpacks the bundled files into a temporary directory at runtime and stores that directory's path in sys._MEIPASS. The directory is deleted when the program exits, which is why your downloads disappear. To fix this, point base_path at the directory that contains the executable instead. When sys.frozen is true, sys.executable is the path of the .exe itself, so os.path.dirname(sys.executable) gives you that directory.

    The implementation would look something like this:

    import os
    import sys
    from selenium import webdriver
    
    # Determine the base path
    if getattr(sys, 'frozen', False):
        # Running as a PyInstaller bundle, use the directory of the executable
        base_path = os.path.dirname(sys.executable)
    else:
        # Running as a script, use the current working directory
        base_path = os.path.abspath(".")
    
    # Define the download directory for "Files" within base_path
    download_dir = os.path.join(base_path, "Files")
    
    # Create the "Files" directory if it doesn't exist
    os.makedirs(download_dir, exist_ok=True)
    
    # Define the path to urls.txt and check if it exists
    urls_file = os.path.join(base_path, "urls.txt")
    if not os.path.isfile(urls_file):
        raise FileNotFoundError(f"Expected 'urls.txt' in {base_path}. Please place 'urls.txt' in the same directory as the executable.")
    
    # Read URLs from urls.txt, dropping trailing newlines and blank lines
    with open(urls_file, "r") as file:
        urls = [line.strip() for line in file if line.strip()]
    
    # Configure Chrome options for Selenium
    chrome_options = webdriver.ChromeOptions()
    prefs = {
        "download.default_directory": download_dir,
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "safebrowsing.enabled": True,
        "safebrowsing.disable_download_protection": True,
        "profile.default_content_setting_values.automatic_downloads": 1,
        "profile.default_content_settings.popups": 0,
        "profile.content_settings.exceptions.automatic_downloads.*.setting": 1
    }
    chrome_options.add_experimental_option("prefs", prefs)
    
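    # Initialize the Chrome WebDriver. Selenium 4 takes the driver path through a
    # Service object; the executable_path argument was deprecated in Selenium 4
    # and removed in later releases.
    from selenium.webdriver.chrome.service import Service
    service = Service(os.path.join(base_path, "chromedriver.exe"))
    driver = webdriver.Chrome(service=service, options=chrome_options)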
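
If you also want to read files that were packed into the executable with --add-data (rather than shipping them next to the .exe), it may help to keep the two locations separate: bundled, read-only inputs live under sys._MEIPASS, while anything the user should keep goes next to the executable. Below is a minimal sketch of that split. The helper names app_dir and bundled_path are hypothetical (they are not part of PyInstaller or Selenium), and the non-frozen fallback uses the current working directory as in the code above.

```python
import os
import sys


def app_dir() -> str:
    """Directory for persistent output (hypothetical helper).

    Frozen: the folder containing the .exe. Otherwise: the current working directory.
    """
    if getattr(sys, "frozen", False):
        return os.path.dirname(os.path.abspath(sys.executable))
    return os.path.abspath(".")


def bundled_path(name: str) -> str:
    """Path to a read-only file packed with --add-data (hypothetical helper).

    Frozen one-file builds unpack such files into sys._MEIPASS; when that
    attribute is absent (plain script run), fall back to the working directory.
    """
    base = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base, name)


# Downloads must survive after the program exits, so they go next to the executable.
download_dir = os.path.join(app_dir(), "Files")

# Inputs bundled into the .exe would be read from the temporary unpack directory.
bundled_urls_file = bundled_path("urls.txt")
```

With this split, download_dir can be passed as "download.default_directory" exactly as above, and the temporary _MEIPASS folder is only ever used for reading.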