Why do Python imghdr test functions take the file as an argument?


I was looking through the source code for the imghdr module, which is part of the Python standard library (I use 2.7). The structure is pretty simple: a what function iterates over a list of functions with names like test_filetype, and if the passed-in file matches any of the tests, it returns the string for that file type.

All of the test_filetype functions take two arguments, h and f. h is a string with the contents of f.read(32), and f is the open file object. None of the test_filetype functions actually use f for anything.

Why would the set of test_filetype functions all take an argument that is never used?


Solution

  • My guess is that this is to allow custom functions to be added to imghdr.tests. From the documentation of the imghdr module:

    You can extend the list of file types imghdr can recognize by appending to this variable:

    imghdr.tests

    A list of functions performing the individual tests. Each function takes two arguments: the byte-stream and an open file-like object. When what() is called with a byte-stream, the file-like object will be None.

    The test function should return a string describing the image type if the test succeeded, or None if it failed.

    As the documentation shows, the imghdr module allows the tests list to be extended. I think the additional argument f is there for the custom functions that are added to this list.

    Taking a look at the imghdr.what() function -

    if h is None:
        if isinstance(file, basestring):
            f = open(file, 'rb')
            h = f.read(32)
        else:
            location = file.tell()
            h = file.read(32)
            file.seek(location)
    

    As can be seen, when we pass a filename to what(), it reads only the first 32 bytes of the file and passes only those 32 bytes as the h argument of each test function. I believe the additional f argument may be for cases where the first 32 bytes are not enough to determine the image format (especially for custom tests).