
Docx to PDF using pandoc in Python


I am quite new to Python, so this may be a silly question, but I can't seem to find the solution anywhere.

I have a Django site that I am running locally on my machine for development. On the site I want to convert a docx file to PDF, and I want to use pandoc to do this. I know there are other methods, such as online APIs or Python modules such as "docx2pdf". However, I want to use pandoc for deployment reasons.

I have installed pandoc from my terminal using brew install pandoc, so it should be installed correctly.

In my Django project I am doing:

import pypandoc
import docx
from django.http import FileResponse

def making_a_doc_function(request):
    doc = docx.Document()
    doc.add_heading("MY DOCUMENT")
    doc.save('thisisdoc.docx')
    pypandoc.convert_file('thisisdoc.docx', 'docx', outputfile="thisisdoc.pdf")
    pdf = open('thisisdoc.pdf', 'rb')
    response = FileResponse(pdf)
    return response

The docx file gets created without any problem, but no PDF is created. I am getting an error that says:

Pandoc died with exitcode "4" during conversion: b'cannot produce pdf output from docx\n'

Does anyone have any ideas?


Solution

  • The second argument to convert_file is the output format or, in this case, the format through which pandoc generates the PDF. Pandoc doesn't know how to produce a PDF through docx, hence the error.

    Use pypandoc.convert_file('thisisdoc.docx', 'latex', outputfile="thisisdoc.pdf") or pypandoc.convert_file('thisisdoc.docx', 'pdf', outputfile="thisisdoc.pdf") instead.
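    As a minimal sketch of the fix, the conversion can be wrapped in a small helper that always passes 'pdf' as the target format. The helper name convert_docx_to_pdf and its convert parameter are hypothetical and not part of pypandoc; convert exists only so the function can be exercised without pandoc installed. This assumes a LaTeX engine such as pdflatex is available, since pandoc builds PDFs through LaTeX by default.

```python
from pathlib import Path


def convert_docx_to_pdf(docx_path, pdf_path=None, convert=None):
    """Convert a .docx file to PDF with pandoc and return the PDF path.

    `convert` is a hypothetical hook for testing; when omitted,
    pypandoc.convert_file is used.
    """
    docx_path = Path(docx_path)
    # Default to the same file name with a .pdf suffix, e.g. thisisdoc.pdf.
    pdf_path = Path(pdf_path) if pdf_path else docx_path.with_suffix(".pdf")

    if convert is None:
        # Imported lazily so this module loads even where pypandoc is absent.
        import pypandoc
        convert = pypandoc.convert_file

    # The second argument is the *target* format: 'pdf', not 'docx'.
    # pypandoc needs an outputfile ending in .pdf for PDF output.
    convert(str(docx_path), "pdf", outputfile=str(pdf_path))
    return pdf_path
```

    In the view from the question, this would replace the failing call: run convert_docx_to_pdf('thisisdoc.docx') after doc.save(...), then open the returned path in 'rb' mode and pass it to FileResponse.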