python, python-3.x, tkinter, pypdf, pdf-extraction

How to extract the text of ALL pages of a PDF after a button click and insert it into a text editor with PyPDF2


I'm stuck trying to extract the text from every page of a PDF and insert it into a text box using PyPDF2. So far I have only succeeded with individual pages (page = reader.pages[0]).

from tkinter import *
from tkinter import ttk
from tkinter import filedialog
from PyPDF2 import PdfReader
root = Tk()
root.title('PDF Extraction')


def openRoundup():
    pass
    file_name = filedialog.askopenfilename(
        initialdir='/', title="select a file", filetypes=(("PDF Files", ".pdf"), ("Txt Files", ".txt")))

    def readRoundup():
        pass
    reader = PdfReader(file_name)
    all_pages = len(reader.pages)
    print(len(reader.pages))
    count = []
    for i in range(count):
        page = reader.pages[i]
        count.append(page.extract_text())
    print(count)
    text_editor.insert(END, count())


text_editor = Text(root, width=40, height=25)
text_editor.pack()

button1 = Button(root, text='Upload', command=openRoundup)
button1.pack()

root.geometry('800x600')
root.mainloop()


Solution

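  • There are a few problems with your code. You define count as a list and then pass that list to range(). That raises a TypeError, because range() needs an integer; you presumably meant range(all_pages). You also call count() as though it were a function, but no such function is defined: count is a list, and lists are not callable, so that line raises a TypeError as well. Finally, the pass statements do nothing, and the code after def readRoundup(): is not indented under it, so readRoundup is an empty function that is never called and the extraction runs as part of openRoundup.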

    If your goal is to insert all of the text from the pdf pages into the text_editor widget, here's the simplest way to do that:

    reader = PdfReader(file_name)
    for page in reader.pages:
        text = page.extract_text()
        text_editor.insert("end", text + "\n")
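For context, here is a minimal sketch of how that loop could fit into the rest of the program. The names collect_page_text, open_roundup and build_app are hypothetical helpers introduced only for this illustration, and clearing the widget before inserting is an assumption about the behaviour you want. It also assumes PyPDF2 is installed.

```python
def collect_page_text(pages):
    """Return the text of every page, separated by newlines."""
    # Defensive: treat a missing or empty extraction result as empty text.
    return "\n".join((page.extract_text() or "") for page in pages)


def open_roundup(text_widget):
    """Ask for a PDF and show the text of all its pages in text_widget."""
    # Imported here so collect_page_text stays usable without Tk or PyPDF2.
    from tkinter import filedialog
    from PyPDF2 import PdfReader

    file_name = filedialog.askopenfilename(
        title="Select a file", filetypes=(("PDF files", "*.pdf"),))
    if not file_name:  # the user cancelled the dialog
        return
    reader = PdfReader(file_name)
    text_widget.delete("1.0", "end")  # assumption: replace earlier content
    text_widget.insert("end", collect_page_text(reader.pages))


def build_app():
    """Create the window with a Text widget and an Upload button."""
    import tkinter as tk

    root = tk.Tk()
    root.title("PDF Extraction")
    root.geometry("800x600")
    text_editor = tk.Text(root, width=40, height=25)
    text_editor.pack()
    # The lambda passes the widget in, so no global variable is needed.
    upload = tk.Button(root, text="Upload",
                       command=lambda: open_roundup(text_editor))
    upload.pack()
    return root
```

To run it, call build_app().mainloop(). Keeping the text collection in its own function means it can be checked with stand-in page objects, without opening a window or a real PDF.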