This is my code:
import tabula
# Specify the path to your PDF file
pdf_path = "path.pdf"
# Use tabula.read_pdf with its default extraction settings
tables = tabula.read_pdf(pdf_path, pages='all', multiple_tables=True)
# Print each table
for i, table in enumerate(tables):
    print(f"Table {i + 1}:\n{table}\n")
And this is the result: table extracted using tabula (Python script)
But in the PDF, the table looks like this: table I want to extract in PDF file
How can I extract the table so that it matches this sample table?
I have found that setting lattice=True makes the table look better: table printed out in terminal after using the lattice parameter
tables = tabula.read_pdf(pdf_path, pages='all',
                         multiple_tables=True, lattice=True)
But there are still redundant columns, for example Unnamed: 0 at the beginning and Unnamed: 1 at the end. How can I get rid of them?
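One possible clean-up is to post-process each DataFrame that tabula returns. This is only a minimal sketch, and it assumes the extra Unnamed columns contain nothing but missing values, which may not hold for every page. The helper name `drop_empty_unnamed_columns` and the `sample` data are hypothetical and stand in for a real extracted table.

```python
import pandas as pd


def drop_empty_unnamed_columns(table: pd.DataFrame) -> pd.DataFrame:
    """Drop columns whose header starts with 'Unnamed:' and that hold only missing values.

    Hypothetical helper: assumes the redundant columns are completely empty.
    """
    to_drop = [
        col for col in table.columns
        if str(col).startswith("Unnamed:") and table[col].isna().all()
    ]
    return table.drop(columns=to_drop)


# Made-up stand-in for one DataFrame returned by tabula.read_pdf(..., lattice=True)
sample = pd.DataFrame({
    "Unnamed: 0": [None, None],
    "Item": ["A", "B"],
    "Price": [1.5, 2.0],
    "Unnamed: 1": [None, None],
})

cleaned = drop_empty_unnamed_columns(sample)
print(list(cleaned.columns))
```

With the real data, the same helper could be applied to every entry of `tables`, for example `tables = [drop_empty_unnamed_columns(t) for t in tables]`. Unnamed columns that still contain values are left alone, so nothing is silently discarded.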