I am using PIL in Python to extract the metadata of an image.
Here is my code:
import json
from PIL import Image, TiffImagePlugin
import PIL.ExifTags
img = Image.open("/home/user/DSCN0010.jpg")
dct = {
    PIL.ExifTags.TAGS[k]: float(v) if isinstance(v, TiffImagePlugin.IFDRational) else v
    for k, v in img._getexif().items()
    if k in PIL.ExifTags.TAGS
}
print(json.dumps(dct))
I'm getting the following error:
Error processing EXIF data: Object of type IFDRational is not JSON serializable
As you can see in the code, I cast all the values of type IFDRational to float, but I'm still getting the error.
Here is the link to the image: https://github.com/ianare/exif-samples/blob/master/jpg/gps/DSCN0010.jpg
The problem is that you are only casting to float the IFDRational values that sit directly at the root of the EXIF items. However, one of those items, called GPSInfo, is a dict that itself contains more IFDRational values.
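To see why the root-level cast is not enough, here is a minimal sketch that needs neither Pillow nor the image. It uses fractions.Fraction as a stand-in for IFDRational (an assumption for illustration only; the tag values are made up), since json rejects both in the same way.

```python
import json
from fractions import Fraction

# Hypothetical EXIF-like data; Fraction stands in for IFDRational.
exif = {
    "FNumber": Fraction(37, 10),
    "GPSInfo": {2: (Fraction(43), Fraction(28), Fraction(14, 5))},
}

# Same idea as the comprehension in the question: cast root values only.
shallow = {
    k: float(v) if isinstance(v, Fraction) else v
    for k, v in exif.items()
}

# The root value was converted, but the rationals inside GPSInfo were not,
# so serialisation still fails.
try:
    json.dumps(shallow)
    error = None
except TypeError as exc:
    error = str(exc)

print(shallow["FNumber"])
print(error)
```

The root-level FNumber becomes a plain float, while the rationals nested inside GPSInfo survive untouched and trigger the same kind of "not JSON serializable" error.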
You would need a function to sanitise the values, ideally one that recursively walks any nested data so that the conversion is done at all levels.
An initial idea would be to do it like this:
import json
import PIL.ExifTags
from PIL import Image, TiffImagePlugin

img = Image.open("/home/user/DSCN0010.jpg")


def sanitise_value(value):
    # Base case: IFDRational to float
    if isinstance(value, TiffImagePlugin.IFDRational):
        return float(value)
    # Dict case: sanitise all values (the dict is updated in place)
    if isinstance(value, dict):
        for k, v in value.items():
            value[k] = sanitise_value(v)
    # List/tuple case: sanitise all values and convert to list,
    # as a tuple in JSON will be a list anyway
    elif isinstance(value, (list, tuple)):
        value = [sanitise_value(i) for i in value]
    # Extra case: some values are byte-strings, so you have to
    # decode them in order to make them JSON serializable. I
    # decided to use 'replace' in case some bytes cannot be
    # decoded, but there are other options
    elif isinstance(value, (bytes, bytearray)):
        value = value.decode("utf-8", "replace")
    return value


dct = {
    PIL.ExifTags.TAGS[k]: sanitise_value(v)  # <-- use the sanitising function here
    for k, v in img._getexif().items()
    if k in PIL.ExifTags.TAGS
}
print(json.dumps(dct))
This works with that image, but feel free to test it with other images, as it may still not be a universal solution.
Also, keep in mind the comment regarding the byte-strings, because you might need to decode them in a different way depending on your needs.
For instance, you might prefer to decode them as latin-1 instead of utf-8, or use another type of error handling (the errors argument of decode) as stated here.
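As a rough illustration of how much the decoding choice matters, the sketch below decodes the same bytes in three ways. The byte string is a made-up example, not a value taken from the image.

```python
# Hypothetical byte string: 0xE9 is "é" in latin-1 but is not valid UTF-8 here.
raw = b"caf\xe9"

# Undecodable bytes become U+FFFD, the replacement character.
as_utf8 = raw.decode("utf-8", "replace")

# latin-1 maps every byte to a code point, so it never fails.
as_latin1 = raw.decode("latin-1")

# Undecodable bytes are kept as visible escape sequences.
as_escaped = raw.decode("utf-8", "backslashreplace")

print(repr(as_utf8))
print(repr(as_latin1))
print(repr(as_escaped))
```

Which one is right depends on what the camera or software actually wrote into the tag, so none of them is a safe default for every file.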
Hope this helps!