Using pdfrw's PdfReader along with ReportLab, I am attempting to pull in a single PDF page and save it (both successful), then pull in a multi-page PDF and do the same. I know how to pull in a PDF one page at a time, but I'm struggling to pull in more than one page.
from reportlab.pdfgen import canvas
from reportlab.lib.units import inch
from pdfrw import PdfReader
from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl
c = canvas.Canvas(Out_Folder+pdf_file_name)
c.setPageSize([11*inch, 8.5*inch])
page = PdfReader(folder+'2_VisionMissionValues.pdf',decompress=False).pages
p = pagexobj(page[0])
c.setPageSize([11*inch, 8.5*inch]) #Set page size (for landscape)
c.doForm(makerl(c, p))
c.showPage()
p3_ = PdfReader(m4folder+'Academy.pdf',decompress=False).pages
Here's where I'm lost. I know this works for pulling in just the first page:
p3 = pagexobj(p3_[0])
But if I want to pull in all pages of the PDF, I'm not sure what to do. I tried this:
p3 = [pagexobj(x) for x in p3_[:]]
but it resulted in an AssertionError (see below).
c.setPageSize([8.5*inch, 11*inch]) #Set page size (for portrait)
c.doForm(makerl(c, p3))
c.showPage()
c.save()
AssertionError: [{'/BBox': [0.0, 0.0, 792.0, 612.0], '/Filter': '/FlateDecode', '/FormType': 1, '/Matrix': [0, 1, -1, 0, 0, 0], '/Length': '56', '/Subtype': '/Form', '/Resources': {'/ProcSet': ['/PDF', '/ImageB', '/ImageC', '/ImageI'], '/XObject': {'/Im1': (8, 0)}}, '/Type': '/XObject'}, {'/BBox': [0.0, 0.0, 792.0, 612.0], '/Filter': '/FlateDecode', '/FormType': 1, '/Matrix': [0, 1, -1, 0, 0, 0], '/Length': '56', '/Subtype': '/Form', '/Resources': {'/ProcSet': ['/PDF', '/ImageB', '/ImageC', '/ImageI'], '/XObject': {'/Im2': (17, 0)}}, '/Type': '/XObject'}]
The reportlab canvas only works on one page at a time, so you need to call the reportlab doForm() and showPage() functions once per output page, not on all the pages as a list.
Edited to add
I just remembered that I have some sample code that will copy a subset of the pages of a PDF file to an output file using reportlab here. The inner loop does this:
for page in pages:
    canvas.setPageSize((page.BBox[2], page.BBox[3]))
    canvas.doForm(makerl(canvas, page))
    canvas.showPage()
For what it's worth, if you're only copying pages, you don't need reportlab; there is a similar subset example in the directory above that does it solely with pdfrw.
(Disclaimer: I am the primary pdfrw author.)