I have a function that opens a zip file, finds a PDF with a given filename, then reads the first page of the PDF to get some specific text. My issue is that after I locate the correct file, I can't open it to read it. I have tried both a relative path within the zip folder and an absolute path in my downloads folder, and I keep getting these errors:

    no such file: 'Deliverables_Rev B\Plans_Rev B.pdf'
    no such file: 'C:\Users\MyProfile\Downloads\Deliverables_Rev B\Plans_Rev B.pdf'
I have been commenting out the os.path.join line to switch between the relative and absolute paths; self.prefs['download_path'] returns my downloads folder. I'm not sure what the issue with the relative path is. Any insight would be helpful, as I think it has to do with trying to read a file from inside a zipped folder.
    import os
    from zipfile import ZipFile

    import fitz  # PyMuPDF

    def getjobcode(self, filename):
        if '.zip' in filename:
            with ZipFile(filename, 'r') as zipObj:
                # namelist() returns member names inside the archive
                for document in zipObj.namelist():
                    if 'plans' in document.lower():
                        # Comment this line out to try the relative path instead
                        document = os.path.join(self.prefs['download_path'], document)
                        # Fails here with "no such file"
                        doc = fitz.open(document)
                        page1 = doc.load_page(0)
                        page1text = page1.get_text('text')
                        # Take the last 12 characters of the 30 that start at the marker
                        jobcode = page1text[page1text.index(
                            'PROJECT NUMBER'):page1text.index('PROJECT NUMBER') + 30][-12:]
                        return jobcode
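For anyone hitting the same error: the names returned by namelist() are members of the archive, not files on disk, so neither the relative nor the joined absolute path exists until the zip has been extracted. The sketch below illustrates that with the standard library only. The function name inspect_member and its parameters are hypothetical, chosen just for this example.

```python
import os
import zipfile


def inspect_member(zip_path, keyword, download_dir):
    """Return (member_name, exists_on_disk, size_in_archive) for the first match.

    Hypothetical helper for illustration; returns None if no member matches.
    """
    with zipfile.ZipFile(zip_path, "r") as zip_obj:
        for member in zip_obj.namelist():
            if keyword in member.lower():
                # The joined path only exists if the archive was extracted there
                on_disk = os.path.exists(os.path.join(download_dir, member))
                # The member's bytes can still be read straight from the archive
                data = zip_obj.read(member)
                return member, on_disk, len(data)
    return None
```

If extracting is not wanted, the bytes from zip_obj.read(member) could in principle be handed to PyMuPDF with fitz.open(stream=data, filetype="pdf"), though I have not tested that against this particular file.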
I ended up extracting the zip folder into the downloads folder, then parsing the PDF to get the data I needed. Afterwards, I created a job folder where I wanted it and moved the extracted folder into it from the downloads folder.
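A minimal sketch of that extract-then-parse approach is below. It is not my exact code: the function names extract_plans_pdf, parse_job_code and get_job_code, and the dest_dir parameter, are made up for this example. It extracts only the matching PDF rather than the whole archive, and it assumes the same 'PROJECT NUMBER' slicing as in the question.

```python
import os
import zipfile


def extract_plans_pdf(zip_path, dest_dir):
    """Extract the first PDF member whose name contains 'plans'.

    Returns the path of the extracted file on disk, or None if nothing matches.
    """
    with zipfile.ZipFile(zip_path, "r") as zip_obj:
        for member in zip_obj.namelist():
            name = member.lower()
            if "plans" in name and name.endswith(".pdf"):
                # extract() returns the path of the file it created
                return zip_obj.extract(member, path=dest_dir)
    return None


def parse_job_code(page_text, marker="PROJECT NUMBER"):
    """Same slicing as the question: last 12 of the 30 characters from the marker."""
    start = page_text.find(marker)
    if start == -1:
        return None
    return page_text[start:start + 30][-12:]


def get_job_code(zip_path, dest_dir):
    """Extract the plans PDF, read page 1 with PyMuPDF, and return the job code."""
    import fitz  # PyMuPDF; imported here so the helpers above work without it

    pdf_path = extract_plans_pdf(zip_path, dest_dir)
    if pdf_path is None:
        return None
    doc = fitz.open(pdf_path)  # a real file on disk now, so this can open it
    try:
        page_text = doc.load_page(0).get_text("text")
    finally:
        doc.close()
    return parse_job_code(page_text)
```

Something like get_job_code(zip_path, self.prefs['download_path']) would then stand in for the original method body, and the extracted folder can be moved into the job folder afterwards (for example with shutil.move).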