python python-3.x pypdf

PyPDF2.errors.DeprecationError: reader.numPages is deprecated and was removed in PyPDF2 3.0.0. Use len(reader.pages) instead


I want to split a multi-page PDF file using Python. The script is shown below:

#!/usr/bin/python3

from PyPDF2 import PdfFileWriter, PdfReader

inputpdf = PdfReader(open("/pdf2xls/split_pdf/xyz.pdf", "rb"))

for i in range(inputpdf.numPages):
    output = PdfFileWriter()
    output.addPage(inputpdf.getPage(i))
    with open("/pdf2xls/split_pdf/xyz-page%s.pdf" % (i+1), "wb") as outputStream:
        output.write(outputStream)

However, it's throwing the following error:

Traceback (most recent call last):
  File "/root/scripts/pdf2xls/Test/T2/pdf_split.py", line 7, in <module>
    for i in range(inputpdf.numPages):
  File "/usr/local/lib/python3.9/dist-packages/PyPDF2/_reader.py", line 467, in numPages
    deprecation_with_replacement("reader.numPages", "len(reader.pages)", "3.0.0")
  File "/usr/local/lib/python3.9/dist-packages/PyPDF2/_utils.py", line 369, in deprecation_with_replacement
    deprecation(DEPR_MSG_HAPPENED.format(old_name, removed_in, new_name))
  File "/usr/local/lib/python3.9/dist-packages/PyPDF2/_utils.py", line 351, in deprecation
    raise DeprecationError(msg)
PyPDF2.errors.DeprecationError: reader.numPages is deprecated and was removed in PyPDF2 3.0.0. Use len(reader.pages) instead.

How can I fix this? Please guide me.


Solution

  • You just need to follow the error messages.

    PyPDF2 was deprecated and development moved back to pypdf. When we did that switch, we made sure the pypdf classes, methods, and functions follow the usual Python naming scheme, so we renamed them from camelCase to snake_case (for example, addPage became add_page). We also switched from getters (like .getPage(index)) to properties (like .pages[index]).

    The pages property behaves like a list, so to get the number of pages you just call len(reader.pages).

    More information is in the official migration guide: https://pypdf2.readthedocs.io/en/3.0.0/user/migration-1-to-2.html

    Did you read the error message? It tells you exactly what you should change.

    Yes, I did that, and now I am getting the following error: PyPDF2.errors.DeprecationError: reader.getPage(pageNumber) is deprecated and was removed in PyPDF2 3.0.0. Use reader.pages[page_number] instead.

    You need to work through them one by one. Don't worry, there are not too many, and the messages tell you exactly what to do.