jpeg exif webp

Exif and EXIF are two different chunks with different formats


I am creating a parser that extracts the Exif metadata from an image. I followed this documentation to parse the Exif data from a JPEG file, and it worked.

// Here are the first 16 bytes of the Exif data, as mentioned in the documentation
45 78 69 66     (Exif)
00 00           (padding)
49 49           (byte align)
2A 00           (byte align)
08 00 00 00     (offset)
0C 00           (number of attributes)

Then I started to extract the Exif data from the .webp file format, but I didn't have any WebP image with Exif data present, so I went to this Exif editor and inserted the Exif data. However, it inserted the data in a different format:

45 58 49 46             (EXIF)   // note that this header is different from the previous one
1E 02                   (chunk size big-endian)
00 00                   (padding maybe or could be part of chunk size)
4D 4D                   (byte align)
00 2A                   (byte align)
00 00 00 10             (offset)
45 78 69 66 4D 65 74 61 (ExifMeta).  // i don't know why this is there

I thought that maybe the website was inserting the data incorrectly, so I viewed the Exif data present in the image using an online Exif viewer, and it was there. So I don't understand why there are two different structures for storing Exif data. Where can I find documentation on how to parse the second type of Exif data?


Solution

  • You cannot blindly scan a file for 45 78 69 66 or 45 78 69 66 00 00 and then expect a full Exif metadata structure, since other things in each file might contain those byte sequences by coincidence, most likely through user texts/comments. You have to treat each file according to its format.

    JFIF = JPEG file interchange format

    It has its own format, not used by anyone else.

    A JFIF file consists of a sequence of markers or marker segments (for details refer to JPEG, Syntax and structure). ... Each marker consists of two bytes: an FF byte followed by a byte which is not equal to 00 or FF and specifies the type of the marker. Some markers stand alone, but most indicate the start of a marker segment that contains data bytes according to the following pattern:

    FF xx s1 s2 [data bytes]

    The bytes s1 and s2 are taken together to represent a big-endian 16-bit integer specifying the length of the following "data bytes" plus the 2 bytes used to represent the length.

    With that knowledge you parse a JFIF file, iterating through all the segments. One (or multiple or zero) of the segments is called APP1, identified by FF E1 followed by two bytes for the size, followed by the actual payload bytes. Each APP1 segment is identified by its first bytes until the first 00 byte - in this case it's Exif\0\0 (45 78 69 66 00 00), where the first 00 byte is the identification termination and only the second 00 byte is for padding reasons. Other possible identifications are:

    And after that identification the overall Exif metadata payload starts.
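The segment walk described above can be sketched in Python as follows. This is a minimal illustration, not a complete JPEG parser: the function name `find_exif_in_jpeg` is made up for this example, and it assumes the whole file is already in memory and that the Exif segment appears before the image scan data begins.

```python
import struct

def find_exif_in_jpeg(data: bytes):
    """Return the Exif payload (TIFF structure) of a JPEG, or None if absent."""
    if data[:2] != b"\xff\xd8":                      # SOI marker opens every JPEG
        raise ValueError("not a JPEG file")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:                           # fill byte before a marker
            pos += 1
            continue
        if marker in (0xD9, 0xDA):                   # EOI or SOS: stop looking
            break
        if marker == 0x01 or 0xD0 <= marker <= 0xD7: # stand-alone markers
            pos += 2
            continue
        # s1 s2: big-endian length that includes its own two bytes
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]                       # skip the identification
        pos += 2 + length
    return None
```

Because the loop only ever looks at the start of an APP1 payload, an "Exif" string inside a comment segment is never mistaken for metadata.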

    WebP = Web Picture

    It uses RIFF, which has been known for decades from other file formats like WAV or AVI. RIFF is Microsoft's adaptation of IFF, and together with Apple's QTFF all three are quite similar:

    RIFF files consist entirely of "chunks". ... All chunks have the following format:

    • 4 bytes: an ASCII identifier for this chunk (examples are "fmt " and "data"; note the space in "fmt ").
    • 4 bytes: an unsigned, little-endian 32-bit integer with the length of this chunk.
    • variable-sized field: the chunk data itself, of the size given in the previous field.
    • a pad byte, if the chunk's length is not even.

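    With that knowledge you parse a RIFF file, iterating through all the chunks. One (or multiple or zero) of the chunks has the identification EXIF (45 58 49 46), and its payload is then the Exif metadata. In your dump, the four bytes 1E 02 00 00 that follow EXIF are therefore the little-endian chunk length, not a big-endian size followed by padding.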
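As a rough sketch of that chunk walk in Python: the function name `find_exif_in_webp` is invented for this example, and the code assumes the usual WebP layout of a 12-byte file header (RIFF, a 4-byte size, then WEBP) followed by the chunks, with the whole file held in memory.

```python
import struct

def find_exif_in_webp(data: bytes):
    """Return the payload of the first EXIF chunk in a WebP file, or None."""
    if data[:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a WebP file")
    pos = 12                                         # first chunk after the file header
    while pos + 8 <= len(data):
        fourcc = data[pos:pos + 4]                   # 4-byte ASCII identifier
        (size,) = struct.unpack("<I", data[pos + 4:pos + 8])  # little-endian length
        if fourcc == b"EXIF":
            return data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)                 # skip the pad byte after odd sizes
    return None
```

The returned bytes are expected to start with the TIFF header (II or MM), which is the same payload that follows Exif\0\0 in a JPEG.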

    TIFF = Tagged Image File Format

    This is not only a file format in its own right; its structure is also used in its entirety for Exif:

    Every TIFF file begins with a two-byte indicator of byte order: II (49 49) for little-endian (a.k.a. "Intel") or MM (4d 4d) for big-endian (a.k.a. "Motorola") byte ordering. The next two-byte word contains the format version number, which has always been 42... All two-byte words, double words, etc., in the TIFF file are assumed to be in the indicated byte order.

    See the official TIFF 6.0 specification for how to parse this format (cannot find where Adobe stores it currently). Exif also has an official specification. Parsing this format is a bit more challenging than parsing JFIF or RIFF.
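To give an idea of the first parsing step, here is a hedged Python sketch that reads the TIFF header and lists the raw entries of the first image file directory (IFD). The function name `read_ifd0_entries` is hypothetical. It relies on the 12-byte entry layout from the TIFF specification (2-byte tag, 2-byte type, 4-byte count, 4-byte value or offset) and does not decode values or follow further IFDs.

```python
import struct

def read_ifd0_entries(tiff: bytes):
    """Return (tag, type, count, raw_value) tuples from the first IFD."""
    byte_order = tiff[:2]
    if byte_order == b"II":
        endian = "<"                                 # little-endian ("Intel")
    elif byte_order == b"MM":
        endian = ">"                                 # big-endian ("Motorola")
    else:
        raise ValueError("missing TIFF byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("missing TIFF version number 42")
    # The offset counts from the start of the TIFF header, not from the file start.
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    entries = []
    for i in range(count):
        start = ifd_offset + 2 + 12 * i              # each entry is 12 bytes long
        tag, typ, n, raw = struct.unpack(endian + "HHI4s", tiff[start:start + 12])
        entries.append((tag, typ, n, raw))           # raw: inline value or an offset
    return entries
```

The four raw bytes of each entry hold the value itself when it fits, otherwise an offset to it, again in the indicated byte order. Resolving those is where most of the remaining work lies.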

    Conclusion

    The bottom line is: don't confuse multiple formats - parse them separately. If you can parse JFIF and RIFF, you should be able to extract the identical payload bytes of the Exif metadata from either. Parsing Exif should then be done separately again, just like one would parse a TIFF file.

    Exif can also reside in other files:

    As you see: scanning for Exif alone would not find every occurrence and may also produce false positives. It's far more robust to just adhere to each file's format. Those formats all have their advantages and disadvantages, and also different and even multiple ways in which Exif metadata can be stored in them. One way to better understand all this is to generate Exif metadata in each file format with a long and unique UserComment text that you can easily spot in any of the files.

    00 00 00 10             (offset)
    45 78 69 66 4D 65 74 61 (ExifMeta).  // i don't know why this is there
    

    You don't need to know: the offset tells you where to look next. In your previous file it was 08 00 00 00 (different endianness), indicating to look at offset 8, which is just the next byte (the offset counts from the start of the TIFF header, so it already includes the 8 header bytes). In this file it is 16, which means the 8 bytes in between are undefined and can be used for various purposes (including stuffing in some kind of advertisement). Just skip them and off you go to the actual TIFF/Exif content.
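The bytes from the question can be checked with a few lines of Python. This is only a worked illustration. The helper name `first_ifd_offset` is made up, and the hex strings are copied from the two dumps above.

```python
import struct

def first_ifd_offset(tiff_header: bytes) -> int:
    """Decode the 8-byte TIFF header and return the offset of the first IFD."""
    endian = {b"II": "<", b"MM": ">"}[tiff_header[:2]]   # byte-order mark
    magic, offset = struct.unpack(endian + "HI", tiff_header[2:8])
    if magic != 42:
        raise ValueError("missing TIFF version number 42")
    return offset

jpeg_header = bytes.fromhex("49 49 2A 00 08 00 00 00")   # from the JPEG dump
webp_header = bytes.fromhex("4D 4D 00 2A 00 00 00 10")   # from the WebP dump

print(first_ifd_offset(jpeg_header))   # 8: the IFD follows the header directly
print(first_ifd_offset(webp_header))   # 16: 8 filler bytes ("ExifMeta") come first

# The four bytes after EXIF in the WebP dump, read as a little-endian chunk length:
print(struct.unpack("<I", bytes.fromhex("1E 02 00 00"))[0])   # 542
```

Both headers decode to the same structure. Only the byte order and the position of the first IFD differ.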