Tags: python, wkhtmltopdf, pdfkit

How to solve "wkhtmltopdf reported an error: Exit with code 1 due to network error: ProtocolUnknownError" in python pdfkit


I'm using Django. This code is in views.py.

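import os

import pdfkit
from django.conf import settings
from django.http import FileResponse
from django.template.loader import get_template


def download_as_pdf_view(request, doc_type, pk):
    file_name = 'invoice.pdf'
    pdf_path = os.path.join(settings.BASE_DIR, 'static', 'pdf', file_name)

    # render the invoice template to an HTML string
    template = get_template("paypal/card_invoice_detail.html")
    _html = template.render({})
    # this is the call that raises the error shown in the traceback
    pdfkit.from_string(_html, pdf_path)

    return FileResponse(open(pdf_path, 'rb'), filename=file_name, content_type='application/pdf')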

The traceback is below.


[2022-09-05 00:56:35,785] ERROR [django.request.log_response:224] Internal Server Error: /paypal/download_pdf/card_invoice/MTE0Nm1vamlva29zaGkz/
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/usr/local/lib/python3.8/site-packages/django/core/handlers/base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/project/app/paypal/views.py", line 473, in download_as_pdf_view
    pdfkit.from_string(str(_html), pdf_path)
  File "/usr/local/lib/python3.8/site-packages/pdfkit/api.py", line 75, in from_string
    return r.to_pdf(output_path)
  File "/usr/local/lib/python3.8/site-packages/pdfkit/pdfkit.py", line 201, in to_pdf
    self.handle_error(exit_code, stderr)
  File "/usr/local/lib/python3.8/site-packages/pdfkit/pdfkit.py", line 155, in handle_error
    raise IOError('wkhtmltopdf reported an error:\n' + stderr)
OSError: wkhtmltopdf reported an error:
Exit with code 1 due to network error: ProtocolUnknownError

[2022-09-05 00:56:35,797] ERROR [django.server.log_message:161] "GET /paypal/download_pdf/card_invoice/MTE0Nm1vamlva29zaGkz/ HTTP/1.1" 500 107486

This works fine:

pdfkit.from_url('https://google.com', 'google.pdf')

However, pdfkit.from_string and pdfkit.from_file both fail with "ProtocolUnknownError".

Please help me!

Update

I tried this code.

    _html = '''<html><body><h1>Hello world</h1></body></html>'''
    pdfkit.from_string(_html, pdf_path)

It worked fine. I saved the above HTML as sample.html, then ran this code.

    _html = render_to_string('path/to/sample.html')
    pdfkit.from_string(str(_html), pdf_path, options={"enable-local-file-access": ""})

It worked fine! The "ProtocolUnknownError" is gone thanks to options={"enable-local-file-access": ""}.

So, I changed the HTML file path to the one I really want to use.

    _html = render_to_string('path/to/invoice.html')
    pdfkit.from_string(_html, pdf_path, options={"enable-local-file-access": ""})
    return FileResponse(open(pdf_path, 'rb'), filename=file_name, content_type='application/pdf')

Now the conversion to PDF never finishes. When I step through the code line by line, this call inside pdfkit does not return:

    stdout, stderr = result.communicate(input=input)

It just keeps processing for a long time.
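
A hang at this point may mean that wkhtmltopdf is waiting on a resource it cannot fetch. As a rough diagnostic, and only a sketch, the rendered HTML could be scanned for resources that point at a remote URL before it is passed to pdfkit. ExternalResourceFinder and find_remote_resources are names made up for this example, and the sample HTML is invented; only the standard library is used.

```python
from html.parser import HTMLParser


class ExternalResourceFinder(HTMLParser):
    """Collect URLs of resources that the renderer would have to load."""

    # Attribute that points at a loadable resource, keyed by tag name.
    RESOURCE_ATTRS = {"link": "href", "script": "src", "img": "src"}

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def find_remote_resources(html):
    """Return resource URLs in `html` that start with http://, https:// or //."""
    finder = ExternalResourceFinder()
    finder.feed(html)
    return [u for u in finder.urls if u.startswith(("http://", "https://", "//"))]


# Invented sample: one remote stylesheet, one local stylesheet, one local image.
sample = '''<html><head>
<link rel="stylesheet" href="https://path/to/style.css">
<link rel="stylesheet" href="/static/paypal/css/invoice.css">
</head><body><img src="logo.png"></body></html>'''
print(find_remote_resources(sample))  # ['https://path/to/style.css']
```

Anything this prints is a candidate for being inlined or replaced with a local file path.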


Solution

  • I solved this problem. There are 3 steps to get past it.

    1. Set the option {"enable-local-file-access": ""}: pdfkit.from_string(_html, pdf_path, options={"enable-local-file-access": ""})

    2. pdfkit.from_string() can't load CSS from a URL, for example <link rel="stylesheet" href="https://path/to/style.css">. The CSS path should be an absolute file path, or the styles should be written in the same HTML file.

    3. If the CSS file loads another file (for example a font file), the conversion fails with ContentNotFoundError.
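
    Step 2 can be sketched as a small helper that swaps the stylesheet <link> tags for an inline <style> block. inline_css and LINK_TAG are names invented for this illustration, and the regular expression is a simplification that assumes the rel attribute is written as rel="stylesheet"; it is not a full HTML parser.

```python
import re

# Matches <link ... rel="stylesheet" ...> tags (simplified, for illustration only).
LINK_TAG = re.compile(r'<link\b[^>]*\brel=["\']stylesheet["\'][^>]*>', re.IGNORECASE)


def inline_css(html, css_text):
    """Replace the stylesheet <link> tags in `html` with one inline <style> block."""
    style_block = "<style>\n" + css_text + "\n</style>"
    # Put the <style> block where the first <link> was. A function is used as
    # the replacement so backslashes in the CSS are not treated as escapes.
    html, replaced = LINK_TAG.subn(lambda match: style_block, html, count=1)
    if not replaced:
        # No <link> found: insert the block just before </head> instead.
        html = html.replace("</head>", style_block + "\n</head>", 1)
    # Drop any remaining stylesheet links.
    return LINK_TAG.sub("", html)


page = ('<html><head><link rel="stylesheet" href="https://path/to/style.css">'
        '</head><body><h1>Invoice</h1></body></html>')
result = inline_css(page, "h1 { font-size: 38px; }")
print(result)
```

    The returned string can then be passed to pdfkit.from_string() together with the option from step 1.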

    My solution

    I used a simple CSS file like this.

    body {
        font-size: 18px;
        padding: 55px;
    }
    
    h1 {
        font-size: 38px;
    }
    
    h2 {
        font-size: 28px;
    }
    
    h3 {
        font-size: 24px;
    }
    
    h4 {
        font-size: 20px;
    }
    
    table, th, td {
        margin: auto;
        text-align: center;
        border: 1px solid;
    }
    
    table {
        width: 80%;
    }
    
    .text-right {
        text-align: right;
    }
    
    
    .text-left {
        text-align: left;
    }
    
    .text-center {
        text-align: center;
    }
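
    The stylesheet above is safe with respect to step 3 because it does not pull in any other file. As a rough pre-flight check, assuming that url(...) values and @import rules are the only ways a stylesheet references other files, something like the following could flag a CSS file before it is handed to wkhtmltopdf. find_css_references and the two patterns are illustrative names, not part of pdfkit.

```python
import re

# url(...) values and @import "..." rules are how one CSS file pulls in another file.
URL_REF = re.compile(r'url\(\s*["\']?([^"\')]+)["\']?\s*\)', re.IGNORECASE)
IMPORT_REF = re.compile(r'@import\s+["\']([^"\']+)["\']', re.IGNORECASE)


def find_css_references(css_text):
    """Return every file or URL that `css_text` refers to via url(...) or @import."""
    refs = URL_REF.findall(css_text) + IMPORT_REF.findall(css_text)
    return [ref.strip() for ref in refs]


# Invented examples: a plain stylesheet and one that loads a font file.
plain_css = "body { font-size: 18px; padding: 55px; }"
font_css = '@font-face { font-family: "X"; src: url("fonts/x.woff2"); }'
print(find_css_references(plain_css))  # []
print(find_css_references(font_css))   # ['fonts/x.woff2']
```

    An empty list means the stylesheet can be inlined as-is; anything else is a file that would also have to be reachable from wkhtmltopdf.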
    

    This code inserts the CSS file above into the HTML as an inline <style> block.

    import os
    
    import pdfkit
    from django.http import FileResponse
    from django.template.loader import render_to_string
    
    from paypal.models import Invoice
    from website import settings
    
    
    def download_as_pdf_view(request, pk):
        # create PDF from HTML template file with context.
        invoice = Invoice.objects.get(pk=pk)
        context = {
            # please set your contexts as dict.
        }
        _html = render_to_string('paypal/card_invoice_detail.html', context)
        # keep only the part of the rendered template from <body> onward
        # (if no literal '<body>' tag is found, the HTML is left unchanged)
        body_start = _html.find('<body>')
        if body_start != -1:
            _html = _html[body_start:]
    
        # create a new head; the stylesheet is inlined inside it
        new_header = '''<!DOCTYPE html>
        <html lang="ja">
        <head>
        <meta charset="utf-8"/>
        <style>
    '''
        # add style from css file. please change to your css file path.
        css_path = os.path.join(settings.BASE_DIR, 'paypal', 'static', 'paypal', 'css', 'invoice.css')
        with open(css_path, 'r', encoding='utf-8') as f:
            new_header += f.read()
        new_header += '\n</style>\n</head>\n'
    
        # prepend the new head to the body
        _html = new_header + _html
        with open('paypal/sample.html', 'w', encoding='utf-8') as f:
            f.write(_html)  # for debug
    
        # convert html to pdf
        file_name = 'invoice.pdf'
        pdf_path = os.path.join(settings.BASE_DIR, 'static', 'pdf', file_name)
        pdfkit.from_string(_html, pdf_path, options={"enable-local-file-access": ""})
        return FileResponse(open(pdf_path, 'rb'), filename=file_name, content_type='application/pdf')
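
    As a variation on the view above, and only a sketch: pdfkit returns the PDF as bytes when False is passed as the output path, so the intermediate file under static/pdf can be skipped. That also avoids two requests writing to the same invoice.pdf at the same time. PDF_OPTIONS, render_pdf_bytes and pdf_response are hypothetical names chosen for this example; the imports are placed inside the functions so the module can be loaded without pdfkit or Django installed.

```python
# Options passed on to wkhtmltopdf; an empty string means "flag without a value".
PDF_OPTIONS = {"enable-local-file-access": ""}


def render_pdf_bytes(html):
    """Convert an HTML string to PDF and return the PDF content as bytes."""
    import pdfkit

    # Passing False as the output path makes pdfkit return the PDF content
    # instead of writing it to a file.
    return pdfkit.from_string(html, False, options=PDF_OPTIONS)


def pdf_response(html, file_name="invoice.pdf"):
    """Hypothetical helper: wrap the PDF bytes in a Django HttpResponse."""
    from django.http import HttpResponse

    response = HttpResponse(render_pdf_bytes(html), content_type="application/pdf")
    response["Content-Disposition"] = 'inline; filename="%s"' % file_name
    return response
```

    In the view, the last three lines would then become return pdf_response(_html, file_name).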