python pdf

Add text to Existing PDF using Python


I need to add some extra text to an existing PDF using Python. What is the best way to go about this, and what extra modules will I need to install?

Note: Ideally I would like to be able to run this on both Windows and Linux, but at a push Linux only will do.

Edit: pypdf and ReportLab look good, but neither one will let me edit an existing PDF. Are there any other options?


Solution

  • I know this is an older post, but I spent a long time trying to find a solution. I came across a decent one using only ReportLab and PyPDF so I thought I'd share:

    1. Read your PDF using PdfFileReader(); we'll call this input.
    2. Create a new PDF containing the text to add using ReportLab, and save it to an in-memory string object (a file-like buffer) rather than to disk.
    3. Read the string object using PdfFileReader(); we'll call this text.
    4. Create a new PDF object using PdfFileWriter(); we'll call this output.
    5. Iterate through the pages of input and call .mergePage(text.getPage(0)) on each page you want the text added to, then use output.addPage() to add the pages to the new document and output.write() to save it.

    This works well for simple text additions. See PyPDF's sample for watermarking a document.

    Here is some code to answer the question below:

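    import io
    from PyPDF2 import PdfFileReader  # PyPDF2 < 3.0; later releases rename this to PdfReader
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    packet = io.BytesIO()  # in-memory buffer (StringIO.StringIO() on Python 2)
    can = canvas.Canvas(packet, pagesize=letter)
    can.drawString(10, 100, "Hello world")  # example only; put your own drawing calls here
    can.save()
    packet.seek(0)  # rewind so the reader starts at the beginning of the buffer
    text = PdfFileReader(packet)

    From here you can merge the page in text into the pages of your existing document.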