I want to parse tables from the last page of a batch of PDF documents. The number of pages varies from document to document, but the last page is the same in all of them. At the moment I read every page:
all_tables_stream = tabula.read_pdf(path, password=password, stream=True, pages='all')
Is there an elegant way to do this where I don't have to scrape all pages in the document just to get to the tables on the final page?
First get the number of pages, for example with pyPdf, then pass that number as the pages argument so that only the last page is parsed:
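import pyPdf
import tabula

# tabula numbers pages from 1, so the page count is also the last page number.
with open(path, mode='rb') as f:
    reader = pyPdf.PdfFileReader(f)
    n = reader.getNumPages()

# Parse only the last page instead of the whole document.
all_tables_stream = tabula.read_pdf(path, password=password, stream=True, pages=n)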
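pyPdf is no longer maintained, so here is a rough sketch of the same idea using its successor, pypdf, wrapped in a reusable function. It assumes pypdf and tabula-py are installed. The names count_pages and read_last_page_tables, and the page_counter and table_reader parameters, are made up for this example and are not part of either library; they only exist so the logic can be tried without a real PDF.

```python
def count_pages(path, password=None):
    """Return the number of pages in a PDF (assumes pypdf is installed)."""
    from pypdf import PdfReader  # imported lazily so the module loads without pypdf

    # PdfReader accepts a password for encrypted files; None means no password.
    reader = PdfReader(path, password=password)
    return len(reader.pages)


def read_last_page_tables(path, password=None,
                          page_counter=count_pages, table_reader=None):
    """Parse tables from the last page only.

    page_counter and table_reader are hypothetical hooks for this sketch;
    by default they resolve to pypdf and tabula.read_pdf.
    """
    if table_reader is None:
        import tabula  # tabula-py, imported lazily
        table_reader = tabula.read_pdf

    # tabula numbers pages from 1, so the page count is the last page number.
    last_page = page_counter(path, password)
    return table_reader(path, password=password, stream=True, pages=last_page)
```

With real files you would call read_last_page_tables(path, password=password) for each document in the batch; only the page count is read before tabula is asked for a single page.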