
ReportLab Add Multi-page PDF to Canvas


Using pdfrw's PdfReader along with ReportLab, I am trying to pull in a single PDF page and save it (both of which work), then pull in a multi-page PDF and do the same. I know how to pull in a PDF one page at a time, but I'm struggling to pull in more than one page.

from reportlab.pdfgen import canvas
from reportlab.lib.units import inch  # needed for the inch-based page sizes below
from pdfrw import PdfReader
from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl

c = canvas.Canvas(Out_Folder+pdf_file_name)
c.setPageSize([11*inch, 8.5*inch])

page = PdfReader(folder+'2_VisionMissionValues.pdf',decompress=False).pages
p = pagexobj(page[0])
c.setPageSize([11*inch, 8.5*inch]) #Set page size (for landscape)
c.doForm(makerl(c, p))
c.showPage()

p3_ = PdfReader(m4folder+'Academy.pdf',decompress=False).pages

Here's where I'm lost. I know this works for pulling in just the first page:

p3 = pagexobj(p3_[0])

But if I want to pull in all pages of the PDF, I'm not sure what to do. I tried this:

p3 = [pagexobj(x) for x in p3_[:]]

but the following code then raised an AssertionError (shown after the code):

c.setPageSize([8.5*inch, 11*inch]) #Set page size (for portrait)
c.doForm(makerl(c, p3))
c.showPage()
c.save()


AssertionError: [{'/BBox': [0.0, 0.0, 792.0, 612.0], '/Filter': '/FlateDecode', '/FormType': 1, '/Matrix': [0, 1, -1, 0, 0, 0], '/Length': '56', '/Subtype': '/Form', '/Resources': {'/ProcSet': ['/PDF', '/ImageB', '/ImageC', '/ImageI'], '/XObject': {'/Im1': (8, 0)}}, '/Type': '/XObject'}, {'/BBox': [0.0, 0.0, 792.0, 612.0], '/Filter': '/FlateDecode', '/FormType': 1, '/Matrix': [0, 1, -1, 0, 0, 0], '/Length': '56', '/Subtype': '/Form', '/Resources': {'/ProcSet': ['/PDF', '/ImageB', '/ImageC', '/ImageI'], '/XObject': {'/Im2': (17, 0)}}, '/Type': '/XObject'}]

Solution

  • A reportlab canvas works on one page at a time, so you need to call doForm() and showPage() once per output page, passing makerl() one page object at a time rather than the whole list of pages.

    Edited to add

    I just remembered that I have some sample code that will copy a subset of the pages of a PDF file to an output file using reportlab here. The inner loop does this:

    for page in pages:
        canvas.setPageSize((page.BBox[2], page.BBox[3]))
        canvas.doForm(makerl(canvas, page))
        canvas.showPage()
    

    For what it's worth, if you're only copying pages, you don't need reportlab; there is a similar subset example in the directory above that does it solely with pdfrw.

    (Disclaimer: I am the primary pdfrw author.)