Tags: python, pdf-generation, relative-path, reportlab

How to create relative links to files in PDF using ReportLab in Python?


I am generating a PDF file using ReportLab in Python. The structure of my project looks like this:

project_root/
│
├── workspace/
│   └── generated_report.pdf
│
└── data/
    ├── folder_1/
    │   └── file_1.txt
    └── folder_2/
        └── file_2.txt

I generate the PDF inside the workspace folder. The data folder contains nested directories and files that I want to link to from within the PDF.

Here is my code:

import os

from reportlab.platypus import Paragraph


def generate_link(pdf_path, file_path, styles):
    link = f'file://{os.path.join(os.path.dirname(os.path.abspath(pdf_path)), os.pardir, os.path.relpath(file_path))}'
    return Paragraph(f'<a href="{link}">{os.path.relpath(file_path)}</a>', styles['BodyText'])

This code generates absolute links in the PDF, but I want them to be relative so that the links still work if I move the workspace and data folders to another location.


Solution

  • In <a href="STRING_A">STRING_B</a>, STRING_A is the link target, while STRING_B is the (mandatory) text displayed for the link.

    To generate relative links, apply os.path.relpath to STRING_A, as you already do for STRING_B. Note that os.path.relpath takes an optional second argument called start [0], which defaults to the current working directory. If you do not run your script from the root directory of your project, you will have to set start to that path.

    import os
    from reportlab.platypus import Paragraph
    
    
    # Added a `project_root` parameter: the path of your project/workflow root.
    # If `project_root` is the same as `pdf_path`, drop the parameter and replace
    # `project_root` with `pdf_path` in the rest of the function.
    #                                              \/\/\/\/\/\/
    def generate_link(pdf_path, file_path, styles, project_root):
        # The way you compute this path seems a little bit convoluted
        # but if it works, you may keep it as-is.
        computed_pdf_path = os.path.join(os.path.dirname(os.path.abspath(pdf_path)), os.pardir, os.path.relpath(file_path))
    
        relative_path = os.path.relpath(computed_pdf_path, os.path.dirname(project_root))
        link = f'file://{relative_path}'
    
        return Paragraph(f'<a href="{link}">{relative_path}</a>', styles['BodyText'])
    

    [0] https://docs.python.org/3/library/os.path.html#os.path.relpath
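
To illustrate the effect of the start argument in isolation, here is a minimal sketch that leaves ReportLab out entirely. It assumes the directory layout from the question and that the link should be relative to the folder containing the PDF. The helper name relative_target and the /mnt/backup location are hypothetical, and posixpath is used instead of os.path only so that the result is the same on every platform. Whether a given PDF viewer actually resolves a relative target is not covered here.

```python
import posixpath


def relative_target(pdf_path, file_path):
    """Return file_path expressed relative to the directory containing pdf_path."""
    pdf_dir = posixpath.dirname(pdf_path)
    # `start` is the directory the result is computed from; without it,
    # relpath falls back to the current working directory.
    return posixpath.relpath(file_path, start=pdf_dir)


# Layout from the question.
pdf = "/project_root/workspace/generated_report.pdf"
txt = "/project_root/data/folder_1/file_1.txt"
print(relative_target(pdf, txt))  # ../data/folder_1/file_1.txt

# After moving both folders together (hypothetical new location),
# the relative target has the same shape.
moved_pdf = "/mnt/backup/workspace/generated_report.pdf"
moved_txt = "/mnt/backup/data/folder_2/file_2.txt"
print(relative_target(moved_pdf, moved_txt))  # ../data/folder_2/file_2.txt
```

Because the result depends only on the position of the two paths relative to each other, it does not change when workspace and data are moved together, which is the property the question asks for.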