
FileNotFoundError: [WinError 2] -python


I'm new to Python and I'm getting this error when trying to run the following code, which aims to take the contents of this PDF and put them in an Excel document. My OS is Windows 10 and I'm using VS Code via Anaconda3. I'm not sure what I'm doing wrong. Thank you all in advance.

FileNotFoundError: [WinError 2] The system cannot find the file specified

import tabula
file_path = (r"C:\Users\shattv\anaconda3\envs\venv1\TestInvoice.pdf")
oup = (r"C:\Users\shattv\anaconda3\envs\venv1\test.xlsx")
df = tabula.read_pdf(file_path,pages="all")
df.to_excel (oup)


I tried checking os.getcwd() and got the same path: C:\Users\shattv\anaconda3\envs\venv1. Below are screenshots of the Excel and PDF files. I also tried changing to forward slashes and still got this error.

"C:/Users/shattv/anaconda3/envs/venv1/TestInvoice.pdf"

[Screenshots of the Excel and PDF files]


Solution

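  • Try this:

    1. Remove the r prefix in front of the path and use forward slashes.

      file_path = "C:/Users/user/anaconda3/envs/venv1/TestInvoice.pdf"

    This may work, although a raw string with backslashes is also a valid Windows path. If it does not work, check whether Python can find the file: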

    import os.path
    file_path = "C:/Users/user/anaconda3/envs/venv1/TestInvoice.pdf"
    # True only if the path exists and points to a regular file
    is_file = os.path.isfile(file_path)
    print(is_file)
    

    If this prints False, Python cannot locate the file; follow this tutorial. If it prints True, try installing Java and adding it to PATH. tabula-py is a simple Python wrapper of tabula-java, which reads tables in a PDF and converts them into a pandas DataFrame or a CSV, TSV, or JSON file. Because it runs Java under the hood, a missing java executable can itself raise FileNotFoundError: [WinError 2]. You should have these two things installed:

    1. Java 8+
    2. Python 3.8+

    Once you have both, it should work. If not, I do not know how to fix that.
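
The two checks described in this answer (is the PDF visible to Python, and is Java reachable through PATH) can be combined into one small script. This is only a rough sketch using the standard library: `diagnose` is a hypothetical helper name, not part of tabula-py, and it assumes Java is expected to be launched as `java` from PATH.

```python
import os
import shutil


def diagnose(pdf_path):
    """Return a list of readable problems; an empty list means both checks passed."""
    problems = []

    # Check 1: can Python see the PDF at exactly this path?
    if not os.path.isfile(pdf_path):
        problems.append(f"PDF not found: {pdf_path}")
        folder = os.path.dirname(pdf_path) or "."
        if os.path.isdir(folder):
            # List the PDFs that are there, to catch a misspelled name
            # or a doubled extension such as "TestInvoice.pdf.pdf".
            pdfs = sorted(n for n in os.listdir(folder) if n.lower().endswith(".pdf"))
            problems.append(f"PDF files in {folder}: {pdfs}")
        else:
            problems.append(f"Folder not found: {folder}")

    # Check 2: is a `java` executable reachable through PATH?
    if shutil.which("java") is None:
        problems.append("java not found on PATH")

    return problems


if __name__ == "__main__":
    # Path taken from the answer above; replace it with your own.
    for line in diagnose("C:/Users/user/anaconda3/envs/venv1/TestInvoice.pdf"):
        print(line)
```

If the script prints nothing, both checks passed and the cause is probably elsewhere. If it reports that java is missing, installing Java and reopening the terminal (so the updated PATH is picked up) is the first thing to try.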