[SOLVED] How to find the original format of images (pages) present in a tiff file using python?

I have a multi-page TIFF file (merged.tiff) from which I need to extract the individual images in their original format. PIL lets you iterate through the pages and write each one to disk in whatever format you choose (PNG/JPEG). For example:

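from PIL import Image

# Write every page of the multi-page TIFF out as both JPEG and PNG.
with Image.open('merged.tiff') as img:
    # range(img.n_frames) never runs past the last page, so no EOFError guard is needed.
    for i in range(img.n_frames):
        img.seek(i)  # make page i the current frame
        img.save(f'individual_{i}.jpg')
        img.save(f'individual_{i}.png')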

But is there a way to find out the original format of those images? I have tried tifffile and tiffany, which let me convert each page to a NumPy array and then write it to disk as an image, but neither tells me the source format of the images contained in the TIFF file.

Solution

In the most general case, I believe this is impossible, because it is perfectly feasible to take, say, a JPEG image and include it in the TIFF file as an uncompressed RGB array.

Realistically, though, you should be able to look at some of the tags of the TIFF file, e.g. Compression, to make an educated guess about what the image used to be. Tools like tiffinfo and tiffdump (from the libtiff package) can be used to examine the TIFF file.
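
To do the same kind of inspection from Python, here is a minimal sketch that walks the page directories (IFDs) of a TIFF file using only the standard library and reports each page's Compression tag (ID 259). The names page_compressions and COMPRESSION_NAMES are made up for this example, the table of codes is not exhaustive, and the sketch assumes a classic TIFF (no BigTIFF, no sub-IFDs). Treat the result as a hint only: as noted above, a page marked uncompressed may still have started life as a JPEG.

```python
import struct

COMPRESSION_TAG = 259  # tag ID of "Compression" in the TIFF 6.0 spec

# Common Compression codes; illustrative, not exhaustive.
COMPRESSION_NAMES = {
    1: "uncompressed",
    2: "CCITT RLE",
    3: "CCITT Group 3 fax",
    4: "CCITT Group 4 fax",
    5: "LZW",
    6: "JPEG (old-style)",
    7: "JPEG",
    8: "Deflate (Adobe)",
    32773: "PackBits",
    32946: "Deflate",
}


def page_compressions(path):
    """Return the Compression code of every page (IFD) in a classic TIFF."""
    with open(path, "rb") as f:
        data = f.read()

    # The first two bytes give the byte order used by the rest of the file.
    if data[:2] == b"II":
        endian = "<"  # little-endian
    elif data[:2] == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")

    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic number 42) is handled")

    codes = []
    seen = set()  # guards against a corrupt file whose IFD chain loops
    while offset != 0 and offset not in seen:
        seen.add(offset)
        (n_entries,) = struct.unpack_from(endian + "H", data, offset)
        code = 1  # default defined by the spec when the tag is absent
        for k in range(n_entries):
            entry = offset + 2 + 12 * k  # each IFD entry is 12 bytes
            tag, _type, _count = struct.unpack_from(endian + "HHI", data, entry)
            if tag == COMPRESSION_TAG:
                # A single SHORT value sits in the first 2 bytes of the value field.
                (code,) = struct.unpack_from(endian + "H", data, entry + 8)
        codes.append(code)
        # The offset of the next IFD follows the last entry; 0 ends the chain.
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * n_entries)
    return codes
```

Calling page_compressions('merged.tiff') returns one code per page, which can be looked up with COMPRESSION_NAMES.get(code, 'unknown'). A page reported as JPEG was probably stored from JPEG data, while the lossless or uncompressed codes say little about the source.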