python json python-imaging-library metadata

IFDRational is not JSON serializable using Pillow


I am using PIL (Pillow) in Python to extract the metadata of an image.

Here is my code:

import json
from PIL import Image, TiffImagePlugin
import PIL.ExifTags

img = Image.open("/home/user/DSCN0010.jpg")

dct = {
        PIL.ExifTags.TAGS[k]: float(v) if isinstance(v, TiffImagePlugin.IFDRational) else v
        for k, v in img._getexif().items()
        if k in PIL.ExifTags.TAGS
    }

print(json.dumps(dct))

I'm getting the following error:

Error processing EXIF data: Object of type IFDRational is not JSON serializable

As you can see in the code, I cast all the values of type IFDRational to float but I'm still getting the error.

Here is the link to the image: https://github.com/ianare/exif-samples/blob/master/jpg/gps/DSCN0010.jpg


Solution

  • The problem is that you only cast to float the IFDRational values that sit directly at the root of the EXIF items. However, one of those items, GPSInfo, is a dict that itself contains more IFDRational values.
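
The failure can be reproduced without the image. The snippet below is a minimal sketch: it uses fractions.Fraction as a stand-in for IFDRational (an assumption, so that it runs without Pillow), and the values in exif_like are made up for illustration.

```python
import json
from fractions import Fraction  # stand-in for TiffImagePlugin.IFDRational (assumption)

# Hypothetical EXIF-like data: one rational at the root,
# more rationals nested inside the GPSInfo dict
exif_like = {
    "FNumber": Fraction(7, 2),
    "GPSInfo": {2: (Fraction(1, 2), Fraction(3, 4))},
}

# Same one-level conversion as in the question
flat = {k: float(v) if isinstance(v, Fraction) else v for k, v in exif_like.items()}

try:
    json.dumps(flat)
    error = None
except TypeError as exc:
    error = str(exc)

# The root value was converted, the nested ones were not
print(flat["FNumber"], error)
```

The root-level FNumber becomes a float, but the rationals inside GPSInfo are never visited, so json.dumps still fails on them.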

    You need a function that sanitises the values by walking recursively through any nested data, so that the conversion happens at every level.

    An initial idea would be to do it like this:

    import json
    
    import PIL.ExifTags
    from PIL import Image, TiffImagePlugin
    
    img = Image.open("/home/user/DSCN0010.jpg")
    
    def sanitise_value(value):
        # Base case: IFDRational to float
        if isinstance(value, TiffImagePlugin.IFDRational):
            return float(value)
    
        # Dict case: sanitise all values into a new dict,
        # leaving the original EXIF data untouched
        if isinstance(value, dict):
            value = {k: sanitise_value(v) for k, v in value.items()}
    
        # List/tuple case: sanitise all values and convert to list,
        # as a tuple in JSON will anyway be a list
        elif isinstance(value, (list, tuple)):
            value = [sanitise_value(i) for i in value]
    
        # Extra case: some values are byte-strings, so you have to
        # decode them in order to make them JSON serializable. I
        # decided to use 'replace' in case some bytes cannot be
        # decoded, but there are other options
        elif isinstance(value, (bytes, bytearray)):
            value = value.decode("utf-8", "replace")
    
        return value
    
    
    dct = {
        PIL.ExifTags.TAGS[k]: sanitise_value(v)  # <-- use the sanitising function here
        for k, v in img._getexif().items()
        if k in PIL.ExifTags.TAGS
    }
    
    print(json.dumps(dct))
    

    This works with that image, but test it with other files as well, since it may still not be a universal solution.
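
A possible alternative is to let json.dumps do the recursion through its default= hook, which it calls for every object it cannot serialize, at any nesting depth. This is only a sketch under assumptions: json_fallback is a hypothetical helper name, Fraction again stands in for IFDRational so the snippet runs without Pillow, and the data is made up.

```python
import json
from fractions import Fraction  # stand-in for TiffImagePlugin.IFDRational (assumption)


def json_fallback(obj):
    """Hypothetical helper: called by json.dumps for objects it cannot serialize."""
    if isinstance(obj, Fraction):  # with Pillow: TiffImagePlugin.IFDRational
        return float(obj)
    if isinstance(obj, (bytes, bytearray)):
        return obj.decode("utf-8", "replace")
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Made-up EXIF-like data with nested rationals and a byte string
exif_like = {
    "FNumber": Fraction(7, 2),
    "GPSInfo": {1: "N", 2: (Fraction(1, 2), Fraction(3, 4))},
    "MakerNote": b"abc",
}

encoded = json.dumps(exif_like, default=json_fallback)
print(encoded)
```

With real EXIF data you would swap the Fraction check for TiffImagePlugin.IFDRational. The trade-off is that the conversion only happens while encoding, so the original dict still holds the unconverted values.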

    Also, keep in mind the comment about byte strings: you may need to decode them differently depending on your needs. For instance, you might prefer to decode them as latin-1 instead of utf-8, or use a different error handler such as 'ignore' or 'backslashreplace'.
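
To make the difference concrete, here is a small sketch of how the decoding choices behave on bytes that are not valid UTF-8. The sample bytes are made up for illustration; real EXIF byte strings may look quite different.

```python
# Hypothetical sample: "été" encoded as Latin-1, which is not valid UTF-8
raw = b"\xe9t\xe9"

as_utf8_replace = raw.decode("utf-8", "replace")          # bad bytes become U+FFFD
as_utf8_ignore = raw.decode("utf-8", "ignore")            # bad bytes are dropped
as_utf8_escaped = raw.decode("utf-8", "backslashreplace") # bad bytes kept as \xNN text
as_latin1 = raw.decode("latin-1")                         # every byte maps to a character

for label, text in [
    ("replace", as_utf8_replace),
    ("ignore", as_utf8_ignore),
    ("backslashreplace", as_utf8_escaped),
    ("latin-1", as_latin1),
]:
    print(label, repr(text))
```

Decoding as latin-1 never fails, because every byte value maps to a character, but it only gives meaningful text if the bytes really were Latin-1. 'backslashreplace' keeps the original byte values visible in the output, which can help when the field is binary rather than text.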

    Hope this helps!