pythonfirefoxplaywrightscreen-capture

How to Capture a Sequence of High-Quality PDF Frames from a Website (Without Screen Recording)?


In Firefox, I can take very high-quality screenshots of a webpage by using Ctrl + P and saving the page as a PDF. This method preserves the text, images, and code in excellent resolution.

Now, I have created a movable bar chart race in Flourish Studio and want to convert it into a high-quality video. However, I do not want to use screen recording tools.

My Goal:
I want to capture 30 high-resolution PDF frames from the website at different points in time (like a video sequence). Ideally, I need a tool or script that can automate the process of saving multiple PDFs from the website as it plays the animation.

What I Tried:
I attempted to write a Python script that:

Opens the local HTML file of my Flourish chart in Firefox using Selenium.
Waits for the page to load.
Listens for the F1 key and triggers Ctrl + P to print the page as a PDF.
However, the script does not save the PDF file in the output folder. I'm not sure why.

Here is my code:

import time
import keyboard
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.firefox.options import Options

# Define paths
html_file_path = r"E:\Desktop\New folder (4)\20250309101616805.html"
geckodriver_path = r"E:\Desktop\New folder (4)\geckodriver.exe"
save_path = r"E:\Desktop\New folder (4)\New folder\output.pdf"  # Save PDF location

# Set up Firefox options
options = Options()
options.set_preference("print.always_print_silent", True)  # Silent printing
options.set_preference("print.show_print_progress", False)  # Hide progress
options.set_preference("print.print_to_file", True)  # Print to file
options.set_preference("print.save_print_settings", True)  # Save settings
options.set_preference("print.printer_PDF", "Save as PDF")  # Set printer
options.set_preference("print.print_to_file", True)  # Enable saving print output to file
options.set_preference("print.print_file_name", save_path)  # Define the save location for PDF

# Start WebDriver
service = Service(executable_path=geckodriver_path)
driver = webdriver.Firefox(service=service, options=options)

# Open the HTML file
driver.get("file:///" + html_file_path)

# Wait for the page to load
time.sleep(2)

print("Press F1 to save as PDF.")

# Listen for F1 key press
while True:
    if keyboard.is_pressed('F1'):
        print("F1 pressed, saving as PDF...")
        
        # Trigger print command (Ctrl + P)
        body = driver.find_element(By.TAG_NAME, 'body')
        body.send_keys(Keys.CONTROL + 'p')
        
        # Wait for the print dialog to process
        time.sleep(2)

        print("PDF should be saved to:", save_path)
        break

# Close browser
driver.quit()

My Questions:

Why is my script not saving the PDF in the specified output folder?

Is there a better way to automate capturing 30 sequential PDFs from the website at different animation frames?

Is there any tool or script that can generate a sequence of PDFs (like 30 frames per second) from a webpage?

Important:

I do NOT want to use screen recording tools.

I only need high-quality PDF frames that can later be converted into a video.

Any help would be greatly appreciated!


Solution

  • this problem was solved by following script:

    import time
    from playwright.sync_api import sync_playwright
    
    def capture_pdf_frames(input_html, output_folder, capture_duration, frame_rate=30):
        with sync_playwright() as p:
            # Launch browser
            browser = p.chromium.launch(headless=True, slow_mo=0)  # Remove slow_mo for speed
            # Create a new page
            page = browser.new_page()
            
            # Load the HTML content (local file or URL)
            page.goto(f'file:///{input_html}')
            
            # Disable unnecessary features like animations to speed up rendering
            page.evaluate("document.documentElement.style.animation = 'none'")
            page.evaluate("document.documentElement.style.transition = 'none'")
            
            # Calculate the total number of frames to capture
            total_frames = capture_duration * frame_rate
            
            # Capture frames in parallel using multiple contexts if possible
            for i in range(total_frames):
                # Capture PDF frame and save with a unique name
                output_pdf = f"{output_folder}/frame_{i + 1}.pdf"
                page.pdf(path=output_pdf, format='A4', print_background=True)
            
            # Close the browser after capturing all frames
            browser.close()
    
    # Usage example
    input_html = r"E:\Desktop\New folder (4)\20250309101616805.html"  # Path to your HTML file
    output_folder = r"E:\Desktop\New folder (4)\New folder"  # Path to save the PDF frames
    capture_duration = 5  # Duration in seconds to capture frames
    frame_rate = 60  # Capture 60 frames per second
    
    capture_pdf_frames(input_html, output_folder, capture_duration, frame_rate)
    

    this script wrote in playwright and working well for save pdf frames