pythonexcelhyperlinkopenpyxl

Get absolute hyperlink from excel via python script


I am using openpyxl to process information from a excel spreadsheet. The spreadsheet is in Microsoft sharepoint, when some hyperlink is added into it, if the hyperlink is in the same sharepoint, it will automatically been translated to relative URL when I click the "save" button.

If I copy this spreadsheet to some other PC location, the relative hyperlink simply cannot work. Microsoft provide a way to solve this in Help > Set the base address for the links in a workbook. But for openpyxl, it seems no method is provided to retrieve this base url address.

I don't want to hard code it in the python script because I need to process different excel sheets. Is there a way for the python script to translate the relative URL to absolute URL? Or alternatively, is there a way in excel itself to avoid storing the URL in relative form?

The code snippet to retrieve URL is as the following

wb = load_workbook(file, data_only=True)
sheet = wb[sheet_name]

# Extract the hyperlinks
hyperlinks = {}
for row in sheet.iter_rows():
    for cell in row:
        if cell.hyperlink:
            hyperlinks[cell.coordinate] = cell.hyperlink.target  # here the link might be relative

Solution

  • Turn out after setting the base hyperlink, excel stops to convert URL to relative links. So this question can be closed.