Tags: html, css, command-line-interface, png, wkhtmltoimage

Convert HTML to PNG with width dependent on content locally


I need to convert HTML documents into PNGs whose dimensions depend on their content. Each document contains an image, with some styling that depends on the size of the image/document. I've spent a long time researching and trying solutions, but none fit my needs well enough. For example, if an HTML document contains a 500x500 image with some text below it, I'd expect the output to be 500px wide and as tall as the image plus the text.

wkhtmltoimage is the closest I've come to finding such a program. It has the smart sizing feature I need (set the width to 1 and let it expand to fit the content), but it is based on a very old version of WebKit. It doesn't support CSS3 calc() or vw units, and its JavaScript engine is limited to ECMAScript 5. Although ECMAScript 5 offers many ways to obtain the width of the document, none of them work in wkhtmltoimage: they all return 0. The size of my text depends on the width of the document, so I need support for that.

All other solutions I've found appear not to support smart sizing, since they are based on headless browsers. However, I may have misunderstood them, and they may support what I'm looking for.

For those curious, my actual implementation lives inside a Python script that pipes HTML documents (as strings) into the program and passes the resulting PNG to other parts of the script. However, I don't mind doing extra work there.

TL;DR: Is there a local program that converts HTML documents to PNG files with support for vw, calc(), and smart width?


Solution

  • This is the best solution I was able to create. It uses Selenium and Chromium. One important note: since Selenium takes a long time to boot, I initialize it only ONCE. If you execute this code inside an asyncio loop or similar, make sure to grab a lock so that it only does one thing at a time.

    Also, you need to either modify the temp_file() function or create a temp/ directory in the working folder.

    This is meant to live in its own file; import it and call html2png() from other files. As mentioned previously, booting Selenium takes a while, so the import will stall for a couple of seconds.

    import json
    import os
    import random
    import string
    import sys
    
    from selenium import webdriver
    
    
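    def send(driver, cmd, params=None):
        """Send a raw Chrome DevTools Protocol command through chromedriver."""
        if params is None:
            params = {}
        # Note: _url and _request are private Selenium attributes and may change between releases.
        # Newer Selenium versions also expose driver.execute_cdp_cmd(cmd, params) on Chromium drivers.
        resource = "/session/%s/chromium/send_command_and_get_result" % driver.session_id
        url = driver.command_executor._url + resource
        body = json.dumps({'cmd': cmd, 'params': params})
        response = driver.command_executor._request('POST', url, body)
        # if response['status']: raise Exception(response.get('value'))
        return response.get('value')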
    
    
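    def get_random_string(length):
        """Return a random string of ASCII letters of the given length."""
        return ''.join(random.choice(string.ascii_letters) for _ in range(length))
    
    
    def temp_file(extension="png"):
        """Return an unused file name inside temp/ (the directory must already exist)."""
        while True:
            name = f"temp/{get_random_string(8)}.{extension}"
            if not os.path.exists(name):
                return name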
    
    
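    def loadhtml(driver, html):
        """Write the HTML to a temporary file, load it in the browser, and return the file's path."""
        # Prepend a <base> tag so that relative URLs resolve against the working directory.
        base = "file:///" + os.getcwd().replace("\\", "/")
        # html = html.replace("<base href='./'>", f"<base href='{base}/'>")
        html = f"<base href='{base}/'>" + html
        file = temp_file("html")
        with open(file, "w") as f:
            f.write(html)
        # Load the page from disk; the caller is responsible for deleting the file.
        driver.get("file:///" + os.path.abspath(file).replace("\\", "/"))
        return file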
        # print(json.dumps(html))
        # print(html)
        # html_bs64 = base64.b64encode(html.encode('utf-8')).decode()
        # driver.get("data:text/html;base64," + html_bs64)
    
    
    opts = webdriver.ChromeOptions()
    opts.headless = True
    opts.add_experimental_option('excludeSwitches', ['enable-logging'])
    opts.add_argument('--no-proxy-server')
    opts.add_argument("--window-size=0,0")
    opts.add_argument("--hide-scrollbars")
    opts.add_argument("--headless")
    opts.add_argument("--disable-web-security")
    opts.add_argument("--allow-file-access-from-files")
    opts.add_argument("--allow-file-access-from-file")
    opts.add_argument("--allow-file-access")
    opts.add_argument("--disable-extensions")
    # https://chromedriver.storage.googleapis.com/index.html?path=87.0.4280.88/
    if sys.platform == "win32":
        driver = webdriver.Chrome("chromedriver87.exe", options=opts)
    else:
        driver = webdriver.Chrome("chromedriver87", options=opts)
    
    
    def html2png(html, png):
        """Render an HTML string to a PNG file sized to fit the content."""
        # Start from a minimal window so the layout expands to the content's natural size.
        driver.set_window_size(1, 1)
        html_path = loadhtml(driver, html)
        # JS helper: height of an element including its vertical margins.
        func = """
        function outerHeight(element) {
            const height = element.offsetHeight,
                style = window.getComputedStyle(element);
    
            return ['top', 'bottom']
                .map(function (side) {
                    return parseInt(style["margin-" + side]);
                })
                .reduce(function (total, side) {
                    return total + side;
                }, height);
        }"""
        measure = f"{func};return [document.documentElement.scrollWidth, outerHeight(document.body)];"
        try:
            # Measure and resize twice, presumably because the first resize can
            # reflow the page (e.g. vw-based text) and change the measured size.
            for _ in range(2):
                size = driver.execute_script(measure)
                driver.set_window_size(size[0], size[1])
            # Use a fully transparent default background for the screenshot.
            send(driver, "Emulation.setDefaultBackgroundColorOverride", {'color': {'r': 0, 'g': 0, 'b': 0, 'a': 0}})
            driver.get_screenshot_as_file(png)
        finally:
            # Always delete the temporary HTML file, even if rendering fails.
            os.remove(html_path)
    
    # html2png("<p>test</p>", "test.png")
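
The advice about grabbing a lock when calling this from an asyncio loop could be sketched roughly as follows. This is only an illustration under a few assumptions: SerializedRenderer and demo_peak_concurrency are hypothetical names that are not part of the answer, and fake_render is a stand-in for html2png() so that the sketch runs without Selenium. The blocking call is pushed to a worker thread so the event loop stays responsive, while an asyncio.Lock ensures that only one render uses the shared driver at a time.

```python
import asyncio
import threading
import time


class SerializedRenderer:
    """Hypothetical wrapper: runs a blocking render function one call at a time."""

    def __init__(self, render):
        self._render = render  # e.g. html2png, once imported
        self._lock = None      # created lazily, inside the running event loop

    async def render(self, html, png):
        if self._lock is None:
            self._lock = asyncio.Lock()
        async with self._lock:
            loop = asyncio.get_running_loop()
            # Run the blocking call in the default thread pool so the loop is not stalled.
            await loop.run_in_executor(None, self._render, html, png)


def demo_peak_concurrency(jobs=5):
    """Schedule several renders at once and report how many ever ran simultaneously."""
    active = 0
    peak = 0
    guard = threading.Lock()

    def fake_render(html, png):
        # Stand-in for html2png(): only tracks how many calls overlap.
        nonlocal active, peak
        with guard:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # simulate the time spent rendering
        with guard:
            active -= 1

    renderer = SerializedRenderer(fake_render)

    async def main():
        tasks = [renderer.render("<p>test</p>", f"{i}.png") for i in range(jobs)]
        await asyncio.gather(*tasks)

    asyncio.run(main())
    return peak
```

In real use, SerializedRenderer(html2png) would take the place of the fake function, and each coroutine would await renderer.render(html, path). The lock is created on first use rather than at import time so that it is bound to whichever event loop is actually running.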