Tags: python, python-zipfile, pymupdf

Opening a PDF inside a zip folder with fitz.open()


I have a function that opens a zip file, finds a PDF with a given filename, then reads the first page of the PDF to get some specific text. My issue is that after I locate the correct file, I can't open it to read it. I have tried both a relative path within the zip folder and an absolute path in my downloads folder, and I keep getting these errors:

no such file: 'Deliverables_Rev B\Plans_Rev B.pdf'
no such file: 'C:\Users\MyProfile\Downloads\Deliverables_Rev B\Plans_Rev B.pdf'

I have been commenting out the os.path.join line to switch between the relative and absolute paths, since self.prefs['download_path'] returns my downloads folder. I'm not sure what the issue with the relative path is. Any insight would be helpful, as I think it has to do with trying to read out of a zipped folder.

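import os
from zipfile import ZipFile  # import the class, not the module

import fitz  # PyMuPDF

def getjobcode(self, filename):
    jobcode = None  # stays None if no matching PDF is found
    if '.zip' in filename:
        with ZipFile(filename, 'r') as zipObj:
            # namelist() returns the names of the members stored in the archive
            for document in zipObj.namelist():
                if 'plans' in document.lower():
                    # comment this line out to try the relative path instead
                    document = os.path.join(self.prefs['download_path'], document)
                    doc = fitz.open(document)  # this call raises "no such file"
                    page1 = doc.load_page(0)
                    page1text = page1.get_text('text')
                    # take the 30 characters starting at the label, keep the last 12
                    jobcode = page1text[page1text.index(
                        'PROJECT NUMBER'):page1text.index('PROJECT NUMBER') + 30][-12:]
    return jobcode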

Solution

  • I ended up extracting the zip folder into the downloads folder, then parsing the PDF to get the data I needed. Afterwards, I created a job folder where I wanted it and moved the extracted folder into it from the downloads folder.
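
The extract-then-parse approach described above could look roughly like the sketch below. It is not the exact code that was used. The helper names (extract_plans_pdf, parse_job_code, get_job_code) and the dest_dir parameter are made up for illustration, and the extra '.pdf' check is an assumption added so that folders or other files with "plans" in their name are skipped. The job code slicing is the same as in the question. Moving the extracted folder into a job folder afterwards is not shown.

```python
import os
from zipfile import ZipFile


def extract_plans_pdf(zip_path, dest_dir):
    """Extract the first PDF member whose name contains 'plans'.

    Returns the path of the extracted file on disk, or None if nothing matched.
    """
    with ZipFile(zip_path, "r") as zip_obj:
        for member in zip_obj.namelist():
            name = member.lower()
            # The '.pdf' check is an assumption, not part of the original filter.
            if "plans" in name and name.endswith(".pdf"):
                # extract() writes the member under dest_dir and returns the real path.
                return zip_obj.extract(member, path=dest_dir)
    return None


def parse_job_code(page_text, marker="PROJECT NUMBER"):
    """Same slicing as the question: last 12 of the 30 characters starting at the marker."""
    start = page_text.find(marker)
    if start == -1:
        return None
    return page_text[start:start + 30][-12:]


def get_job_code(zip_path, dest_dir):
    """Extract the plans PDF, read its first page with PyMuPDF, and parse the job code."""
    import fitz  # PyMuPDF; imported here so the helpers above work without it

    pdf_path = extract_plans_pdf(zip_path, dest_dir)
    if pdf_path is None:
        return None
    doc = fitz.open(pdf_path)  # now a real file on disk, so the path resolves
    try:
        page_text = doc.load_page(0).get_text("text")
    finally:
        doc.close()
    return parse_job_code(page_text)
```

Something like get_job_code(filename, self.prefs['download_path']) would then replace the body of getjobcode. If writing the file to disk is not wanted, PyMuPDF can also open a document from bytes, for example fitz.open(stream=zipObj.read(member), filetype="pdf"), which avoids the extraction step.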