
How do I add different IPTC keywords to multiple images?


I have a folder containing thousands of images, and each image needs its own list of keywords. I also have a table listing each image's file path and the keywords it should receive. For example, one record might need the tags "ORASH, Crew 1, Transect A Upstream, Site Layout" (ORASH is a survey site code), while the next might need "ORWLW, Crew 2, Amphibian, Pacific Giant Salamander".

How do I iterate over the images and add the right IPTC keywords to each one? I'm using Python 3 and the iptcinfo3 module, but I'm willing to try other modules that may work.

Here's where I'm at now:

import os
import pandas as pd
from iptcinfo3 import IPTCInfo

srcdir = r'E:\photos'
files = os.listdir(srcdir)

# Create a dataframe from the table containing filepaths and associated keywords.
df = pd.read_excel(r'E:\photo_info.xlsx')

# Create a dictionary with the filename as the key and the tags as the value.
references = dict(df.set_index('basename')['tags'])

for file in files:
    # Get the full filepath for each image.
    filepath = os.path.join(srcdir, file)
    # Create an object for a file that may not have IPTC data (ignore the 'Marker scan...' notification).
    info = IPTCInfo(filepath, force=True)

At this point, I imagine I'd use info['keywords'] = ... together with the 'references' dictionary to assign the keywords to the correct files, then call info.save_as(filepath). I'm just not experienced enough to know how to make this work, or even whether it's a reasonable way of doing it. Any help would be appreciated!
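
For reference, the idea described above could be sketched roughly as follows. This is an untested sketch, not a confirmed solution: it assumes the 'tags' column holds one comma-separated string per image, and the helper names parse_tags and write_keywords are made up for illustration. The only iptcinfo3 calls used are the ones already mentioned in the question (IPTCInfo(..., force=True), info['keywords'], and info.save_as).

```python
import os


def parse_tags(tag_string):
    """Split a comma-separated tag string into a clean list of keywords."""
    if not isinstance(tag_string, str):
        # pandas represents empty cells as NaN (a float); treat those as "no tags".
        return []
    return [tag.strip() for tag in tag_string.split(",") if tag.strip()]


def write_keywords(srcdir, references):
    """Write keywords to each image; `references` maps basename -> tag string.

    Hypothetical helper: assumes iptcinfo3 is installed and that the keys of
    `references` are file names located directly inside `srcdir`.
    """
    # Imported here so parse_tags stays usable without iptcinfo3 installed.
    from iptcinfo3 import IPTCInfo

    for basename, tag_string in references.items():
        filepath = os.path.join(srcdir, basename)
        if not os.path.isfile(filepath):
            print(f"Skipping missing file: {filepath}")
            continue
        # force=True creates the object even if the file has no IPTC data yet.
        info = IPTCInfo(filepath, force=True)
        info["keywords"] = parse_tags(tag_string)
        info.save_as(filepath)


# Example call, using the names from the question:
# write_keywords(srcdir, references)
```

Looping over the 'references' dictionary rather than os.listdir means only files that appear in the table are touched, and files listed in the table but missing from the folder are reported instead of raising an error.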


Solution

  • I saved the table with the filenames and keywords as a .csv file where the fields and records looked like this (though the text in the 'Subject' field didn't include the quotes):

    SourceFile, Artist, Subject

    E:\photos\0048.JPG, MARY GRAY, "YEAR2022, REQUIRED, GPS UNIT WITH TIME"

    Because I use Jupyter Lab for other portions of this workflow, I ran this code there:

    import os
    
    # Raw string so the backslashes in the Windows paths are not treated as escape sequences.
    os.system(r'cmd d: & exiftool -overwrite_original -sep ", " -csv="E:\photos\metadata.csv" E:\photos')
    

    This opens the Windows command prompt, changes the path to the D: drive (where the exiftool.exe file was saved), invokes exiftool, sets it to overwrite the original image file rather than create a copy, defines the keyword separator in the .csv file, reads the .csv file that has the list of filenames and associated keywords, then runs it on the E:\photos directory.

    Worked like a charm on about 4,000 photos!
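
The same workflow can also be sketched with Python's csv and subprocess modules instead of a shell string. This is a minimal sketch under a few assumptions: the function names (write_metadata_csv, build_exiftool_command, run_exiftool) are hypothetical, the rows are supplied as (file path, artist, keyword list) tuples, and exiftool is either on the PATH or passed in by its full path. The exiftool options are the same ones used in the command above.

```python
import csv
import subprocess


def write_metadata_csv(rows, csv_path):
    """Write a CSV for exiftool; rows are (source_file, artist, keywords_list)."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["SourceFile", "Artist", "Subject"])
        for source_file, artist, keywords in rows:
            # Join keywords with ", " to match the -sep value given to exiftool.
            # csv.writer quotes the field automatically because it contains commas.
            writer.writerow([source_file, artist, ", ".join(keywords)])


def build_exiftool_command(csv_path, image_dir, exiftool="exiftool"):
    """Return the exiftool call as an argument list (no shell quoting needed)."""
    return [
        exiftool,
        "-overwrite_original",  # modify the images in place, no backup copies
        "-sep", ", ",           # separator for list-type tags such as Subject
        f"-csv={csv_path}",     # import tag values from this CSV file
        image_dir,
    ]


def run_exiftool(csv_path, image_dir, exiftool="exiftool"):
    """Run exiftool and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_exiftool_command(csv_path, image_dir, exiftool), check=True)


# Example call with the paths from this post (the exiftool location is an assumption):
# run_exiftool(r"E:\photos\metadata.csv", r"E:\photos", exiftool=r"D:\exiftool.exe")
```

Passing the arguments as a list avoids having to quote the separator and the paths for the Windows command prompt, and check=True surfaces a failed exiftool run as an exception rather than a silent non-zero return code.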