
Extract xlsx workbook file metadata/properties in Python 3.6


How can I read the properties/metadata such as Title, Author, Subject, Last modified, and Keywords stored in an xlsx file using Python? I've tried the xlrd library, but it does not seem to expose any of these properties. Any help is appreciated.


Solution

  • You may be interested in openpyxl:

    Something to get you started may look like:

    from openpyxl import load_workbook

    # Load the workbook; its document properties are exposed as wb.properties
    wb = load_workbook('yourfile.xlsx')
    print(wb.properties)
    

    Here's the sample output:

    <openpyxl.packaging.core.DocumentProperties object>
    Parameters:
    creator=u'User', title=None, description=None, subject=None, identifier=None,
    language=None, created=datetime.datetime(2018, 12, 11, 9, 55, 2),
    modified=datetime.datetime(2018, 12, 11, 10, 30, 38), lastModifiedBy=u'User',
    category=None, contentStatus=None, version=None, revision=None, keywords=None,
    lastPrinted=None
    

    Is this something you can work with?
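
If you only need the handful of fields from the question, here is a minimal sketch that picks them off the properties object by attribute name. The attribute names (`title`, `creator`, `subject`, `modified`, `keywords`) are the ones visible in the sample output above. The helper names `summarize_properties` and `read_xlsx_metadata`, and the mapping of "Author" to `creator`, are my own assumptions for illustration and not part of openpyxl.

```python
# Hypothetical mapping from the labels used in the question to the
# attribute names shown in the DocumentProperties output above.
FIELDS = {
    "Title": "title",
    "Author": "creator",          # assumption: "Author" corresponds to creator
    "Subject": "subject",
    "Last modified": "modified",
    "Keywords": "keywords",
}


def summarize_properties(props):
    """Return a dict of the selected fields; missing attributes become None."""
    return {label: getattr(props, attr, None) for label, attr in FIELDS.items()}


def read_xlsx_metadata(path):
    """Load the workbook at `path` and return its selected properties."""
    # Imported here so summarize_properties can be used without openpyxl installed.
    from openpyxl import load_workbook

    wb = load_workbook(path)
    return summarize_properties(wb.properties)
```

Calling `read_xlsx_metadata('yourfile.xlsx')` should then give you a plain dict keyed by those labels, with `None` for any property the file does not set.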