python macos metadata spotlight

Obtaining metadata "Where from" of a file on Mac


I am trying to obtain the "Where from" extended file attribute, which is shown in the "Get Info" window of a file on MacOS.

Example

Right-clicking the file and choosing "Get Info" shows this metadata.

The highlighted part in the image below shows the information I want to obtain (the link of the website where the file was downloaded from).

'Where from' on MacOS file info

I want to read this Mac-specific attribute from Python. I thought of using OS tools but couldn't find any that do this.


Solution

  • TL;DR: Read the extended attribute behind MacOS's "Where from", e.g. pip install pyxattr and call xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms"). The value is stored as a property list, which plistlib can decode.

    Extended Attributes on files

    Extended file attributes, like your "Where from" on MacOS (since 10.4), store metadata that the filesystem itself does not interpret. They exist on several operating systems.

    using the command-line

    You can also query them on the command-line with tools like:

    exiftool -MDItemWhereFroms -MDItemTitle -MDItemAuthors -MDItemDownloadedDate /path/to/file
    
    xattr -p -l -x com.apple.metadata:kMDItemWhereFroms /path/to/file
    

    On MacOS many attributes are stored in property-list format, so use the -x option to get hexadecimal output. Note that xattr -p expects the attribute name before the file path.
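    The hexadecimal output can be turned back into readable values from Python. The following is only a sketch: it assumes the plain, whitespace-separated hex digits that xattr -p -x prints (without -l), and the helper names plist_from_hex and where_froms_via_cli are made up for illustration.

```python
import plistlib
import subprocess


def plist_from_hex(hex_text):
    """Parse whitespace-separated hex digits as a property list."""
    # Drop all spaces and newlines, then convert the hex digits to raw bytes.
    raw = bytes.fromhex("".join(hex_text.split()))
    # plistlib.loads detects binary vs. XML property lists on its own.
    return plistlib.loads(raw)


def where_froms_via_cli(path):
    """Run the xattr tool (MacOS only) and decode its hex output."""
    result = subprocess.run(
        ["xattr", "-p", "-x", "com.apple.metadata:kMDItemWhereFroms", path],
        capture_output=True, text=True, check=True,
    )
    return plist_from_hex(result.stdout)


# Self-contained demo: build a binary plist, print it as hex, decode it again.
sample = plistlib.dumps(
    ["https://example.com/downloads/file.pdf"], fmt=plistlib.FMT_BINARY
)
hex_text = " ".join(f"{byte:02X}" for byte in sample)
print(plist_from_hex(hex_text))
# ['https://example.com/downloads/file.pdf']
```

    On a Mac, where_froms_via_cli("/path/to/file") would run the command for a real file; it raises subprocess.CalledProcessError if the attribute is not set.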

    using Python

    Ture Pålsson pointed out the keywords that were the missing link. Such common and appropriate terms help when searching the Python Package Index (PyPI):

    Search PyPI by keywords: extended file attributes, metadata

    For example, to list and get attributes (adapted from pyxattr's official docs):

    import xattr
    
    xattr.listxattr("file.pdf")
    # ['user.mime_type', 'com.apple.metadata:kMDItemWhereFroms']
    xattr.getxattr("file.pdf", "user.mime_type")
    # b'text/plain'  (values are returned as bytes on Python 3)
    xattr.getxattr("file.pdf", "com.apple.metadata:kMDItemWhereFroms")
    # raw bytes of a property list (e.g. starting with b'bplist00'), not yet a list of URLs
    

    However, the MacOS-specific metadata is stored in property-list (plist) format, so you have to decode the returned bytes, e.g. using plistlib from the standard library.
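    A minimal sketch of that decoding step follows. The function names decode_where_froms and where_froms are hypothetical, and the sketch assumes the installed xattr module exposes getxattr(path, name) returning bytes, as in the example above.

```python
import plistlib


def decode_where_froms(raw):
    """Decode the raw kMDItemWhereFroms bytes (a property list) into a list."""
    # plistlib.loads accepts both binary and XML property lists.
    return list(plistlib.loads(raw))


def where_froms(path):
    """Read and decode the "Where from" attribute of a file (MacOS only)."""
    import xattr  # third-party; imported here so the decoder works without it

    raw = xattr.getxattr(path, "com.apple.metadata:kMDItemWhereFroms")
    return decode_where_froms(raw)


# Demo without touching the filesystem: round-trip a hand-made binary plist.
sample = plistlib.dumps(
    ["https://example.com/downloads/file.pdf", "https://example.com/"],
    fmt=plistlib.FMT_BINARY,
)
print(decode_where_froms(sample))
# ['https://example.com/downloads/file.pdf', 'https://example.com/']
```

    On a Mac, where_froms("file.pdf") would then return the download URLs as strings; if the file has no such attribute, the underlying getxattr call raises an error instead.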

    File metadata on MacOS

    Mac OS X 10.4 (Tiger) introduced Spotlight, a system for extracting (or harvesting), storing, indexing, and querying metadata. It provides an integrated, system-wide service for searching and indexing.

    This metadata is stored as extended file attributes having keys prefixed with com.apple.metadata:. The "Where from" attribute for example has the key com.apple.metadata:kMDItemWhereFroms.
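    To illustrate the prefix convention, here is a small sketch that picks the Spotlight keys out of a list of attribute names, such as the one returned by xattr.listxattr. The helper name spotlight_attributes and the sample names are assumptions for illustration; since attribute names may arrive as bytes or str depending on the library, both are handled.

```python
SPOTLIGHT_PREFIX = "com.apple.metadata:"


def spotlight_attributes(names):
    """Map short Spotlight names (e.g. kMDItemWhereFroms) to their full keys."""
    result = {}
    for name in names:
        # Normalise bytes to str so both kinds of input are treated alike.
        if isinstance(name, bytes):
            name = name.decode("utf-8")
        if name.startswith(SPOTLIGHT_PREFIX):
            result[name[len(SPOTLIGHT_PREFIX):]] = name
    return result


names = [
    "user.mime_type",
    b"com.apple.metadata:kMDItemWhereFroms",
    "com.apple.quarantine",
]
print(spotlight_attributes(names))
# {'kMDItemWhereFroms': 'com.apple.metadata:kMDItemWhereFroms'}
```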

    using Python

    Use osxmetadata to get functionality similar to MacOS's md* utilities:

    from osxmetadata import OSXMetaData
    
    filename = 'file.pdf'
    meta = OSXMetaData(filename)
    
    # get and print "Where from" list, downloaded date, title
    print(meta.wherefroms, meta.downloadeddate, meta.title)
    

    See also