I tried to extract the table below using Tabula, but it returned an empty DataFrame. The same code works fine for other, similar tables.
I also tried Camelot, but that didn't work either. Any suggestions on how I can extract this table?
My code is below:
from tabula import read_pdf
import pandas as pd

Page_No = 1
# read_pdf returns a list of DataFrames, one per detected table
tables = read_pdf('/content/page1.pdf', pages=Page_No, multiple_tables=True)
df1 = pd.DataFrame(tables[0])
df1
import camelot
tables2 = camelot.read_pdf('page1.pdf', flavor='lattice', pages='1')
tables2
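The issue was fixed by passing stream=True and guess=False to tabula's read_pdf. (flavor='stream' is the Camelot equivalent; tabula uses the stream flag instead.) With stream=True, tabula infers columns from the whitespace between text rather than from ruling lines, and guess=False stops it from auto-detecting the table area, so the whole page is parsed.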
from tabula import read_pdf
import pandas as pd

Page_No = 1
# stream=True: detect columns from whitespace; guess=False: parse the whole page
tables = read_pdf('/content/page1.pdf', pages=Page_No, guess=False, stream=True)
df1 = pd.DataFrame(tables[0])
df1
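If it is not known in advance which tables need these options, one approach is to try several option sets in turn and keep the first that yields a non-empty table. The following is only a minimal sketch of that idea, not part of tabula itself: the helper names (first_non_empty, extract_with_fallback, read_page_with_tabula) and the order of the option sets are assumptions made for illustration. The only tabula keyword arguments it relies on are pages, lattice, stream and guess.

```python
def first_non_empty(tables):
    """Return the first table whose .empty attribute is False, else None."""
    for table in tables or []:
        # Objects without an .empty attribute are treated as empty.
        if not getattr(table, "empty", True):
            return table
    return None


def extract_with_fallback(read_fn, path, page, option_sets):
    """Call read_fn with each option set until one gives a non-empty table.

    Returns (table, options) for the first success, or (None, None).
    """
    for options in option_sets:
        table = first_non_empty(read_fn(path, pages=page, **options))
        if table is not None:
            return table, options
    return None, None


def read_page_with_tabula(path, page=1):
    """Hypothetical wrapper: try lattice first, then stream, then stream without guessing."""
    from tabula import read_pdf  # imported lazily; needs tabula-py and Java

    option_sets = [
        {"lattice": True},
        {"stream": True},
        {"stream": True, "guess": False},
    ]
    return extract_with_fallback(read_pdf, path, page, option_sets)
```

Calling read_page_with_tabula('/content/page1.pdf', 1) would return the extracted DataFrame together with the option set that produced it, which makes it easy to see which mode a given PDF needs.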