I have a folder with 452 images (.png) that I'm trying to merge into a single PDF file using Python. Each image is labelled with its intended page number, e.g. "1.png", "2.png", ..., "452.png".
This code was technically successful, but input the pages out of the intended order.
import os
import img2pdf
from PIL import Image
with open("output.pdf", 'wb') as f:
    f.write(img2pdf.convert([i for i in os.listdir('.') if i.endswith(".png")]))
I also tried opening each image in turn, converting it, and writing the result to the PDF, but this yields a 94MB one-page PDF.
import img2pdf
from PIL import Image
with open("output.pdf", 'wb') as f:
    for i in range(1, 453):
        img = Image.open(f"{i}.png")
        pdf_bytes = img2pdf.convert(img)
        f.write(pdf_bytes)
Any help would be appreciated. I've done quite a bit of research but have come up short. Thanks in advance.
> but input the pages out of the intended order
I suspect that the intended order is "in numerical order of file name", i.e. 1.png, 2.png, 3.png, and so forth.
This can be solved with:
import os
import img2pdf
with open("output.pdf", 'wb') as f:
    f.write(img2pdf.convert(sorted([i for i in os.listdir('.') if i.endswith(".png")], key=lambda fname: int(fname.rsplit('.', 1)[0]))))
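This is a slightly modified version of your first attempt. It sorts the file names numerically (the order your second attempt tries to produce with its loop) before passing them all to img2pdf in a single call. The sort is needed because os.listdir returns names in arbitrary order, and the key converts each name to an integer because a plain string sort would put "10.png" before "2.png".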
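To make the ordering issue easier to see, here is a minimal sketch that separates the sorting from the PDF writing. The helper names `page_order` and `build_pdf` are made up for illustration, and it assumes every page file is named as digits followed by ".png"; any other file in the folder is skipped rather than raising an error. The demo at the bottom only exercises the sorting, so it runs without img2pdf installed.

```python
import os


def page_order(filenames):
    """Return the .png names whose stem is an integer, sorted numerically."""
    pages = []
    for name in filenames:
        stem, ext = os.path.splitext(name)
        # Assumption: pages are named "<digits>.png"; skip everything else
        # (e.g. output.pdf or notes.txt) instead of failing in int().
        if ext.lower() == ".png" and stem.isdecimal():
            pages.append(name)
    return sorted(pages, key=lambda name: int(os.path.splitext(name)[0]))


def build_pdf(folder=".", output="output.pdf"):
    """Write every numbered PNG in `folder` into one PDF, one image per page."""
    import img2pdf  # third-party; imported here so page_order works without it

    paths = [os.path.join(folder, name) for name in page_order(os.listdir(folder))]
    with open(output, "wb") as f:
        # A single convert() call with a list of file names yields one PDF.
        f.write(img2pdf.convert(paths))


names = ["10.png", "2.png", "1.png", "100.png", "output.pdf"]
print(sorted(n for n in names if n.endswith(".png")))
# ['1.png', '10.png', '100.png', '2.png']   (plain string sort)
print(page_order(names))
# ['1.png', '2.png', '10.png', '100.png']   (numeric sort)
```

Calling `build_pdf()` from inside the image folder should then give the same result as the one-liner above, with the pages in numerical order.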