python pdf-reader tabula

read_pdf FileNotFoundError: [Errno 2] No such file or directory: in Python


I am trying to scrape tables from a PDF with read_pdf in Python, but it doesn't work. I should also mention that I am doing this on a Mac, in a Jupyter notebook. This is what I do:

from tabula import read_pdf
file = read_pdf(r'C:\Users\myname\Rprojects\Reports_scraping\data_scraped\icnarc_29052020\icnarc_200529.pdf')

I get this error:

FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\myname\\Rprojects\\Reports_scraping\\data_scraped\\icnarc_29052020\\icnarc_200529.pdf'

How can I solve this issue?


Solution

  • Just to check that the file exists: do you get True when running this?

    import os

    file_path = r'C:\Users\myname\Rprojects\Reports_scraping\data_scraped\icnarc_29052020\icnarc_200529.pdf'
    # Prints True only if a regular file exists at exactly this path
    print(os.path.isfile(file_path))
    

    Set file_path to wherever the file actually is (this is Python 3). Also, did you replace "myname" in the path with your actual username? (Just in case.)

    It is preferable to build your paths with os.path.join so that they work across operating systems. To get the project root on Windows you may need to create a root "config.py" file, see

    how to get the root folder on windows


    After discussing with GaB, it turns out that he is using Jupyter notebook on a Mac, which explains the issue: a path starting with C:\Users\ is a Windows path, and no such location exists on macOS. I found this link, but can't help more.

    Jupyter - import pdf

    os.path.join doc
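
As a rough sketch of the os.path.join suggestion, adapted to a Mac: the snippet below assumes (and this is only a guess about the asker's setup) that the Rprojects folder sits directly under the home directory. build_pdf_path is a hypothetical helper name used for illustration, not part of tabula. The read_pdf call is only attempted once the file has been found.

```python
import os


def build_pdf_path(*parts, base="~"):
    """Join path parts under a base folder, expanding '~' to the home directory."""
    return os.path.join(os.path.expanduser(base), *parts)


# Assumption: the project lives under the home directory on the Mac.
pdf_path = build_pdf_path(
    "Rprojects", "Reports_scraping", "data_scraped",
    "icnarc_29052020", "icnarc_200529.pdf",
)

print(pdf_path)                  # the path Python will actually look for
print(os.path.isfile(pdf_path))  # should be True before calling read_pdf

if os.path.isfile(pdf_path):
    # Imported here so the path check still runs without tabula installed.
    from tabula import read_pdf

    tables = read_pdf(pdf_path, pages="all")
else:
    # Show where the notebook is running, to help locate the file.
    print("File not found. Current working directory:", os.getcwd())
```

If the second print shows False, adjust the folder names passed to build_pdf_path (or the base argument) until it shows True; the error in the question should then go away, since it comes from the path rather than from tabula.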