python gnome nautilus exiftool

Speed up Nautilus Python extensions for reading images' Exif data


I've written a Nautilus extension that reads pictures' metadata (by executing exiftool), but when I open folders with many files, it really slows down the file manager, which hangs until it has finished reading the files' data.

Is there a way to make Nautilus keep working while it runs my extension? Perhaps the Exif data could appear gradually in the columns while I go on with my work.

#!/usr/bin/python

# Requires:
# nautilus-python
# exiftool
# gconf-python

# Version 0.15

import gobject
import nautilus
from subprocess import Popen, PIPE
from urllib import unquote
import gconf

def getexiftool(filename):
    options = '-fast2 -f -m -q -q -s3 -ExifIFD:DateTimeOriginal -IFD0:Software -ExifIFD:Flash -Composite:ImageSize -IFD0:Model'
    exiftool=Popen(['/usr/bin/exiftool'] + options.split() + [filename],stdout=PIPE,stderr=PIPE)
    # '-Nikon:ShutterCount' cannot be used with the -fast2 argument
    output,errors=exiftool.communicate()
    return output.split('\n')

class ColumnExtension(nautilus.ColumnProvider, nautilus.InfoProvider, gobject.GObject):
    def __init__(self):
        pass

    def get_columns(self):
        return (
            nautilus.Column("NautilusPython::ExifIFD:DateTimeOriginal","ExifIFD:DateTimeOriginal","Date (ExifIFD)","Date taken"),
            nautilus.Column("NautilusPython::IFD0:Software","IFD0:Software","Software (IFD0)","Software used"),
            nautilus.Column("NautilusPython::ExifIFD:Flash","ExifIFD:Flash","Flash (ExifIFD)","Flash mode"),
            nautilus.Column("NautilusPython::Composite:ImageSize","Composite:ImageSize","Resolution (Exif)","Image resolution"),
            nautilus.Column("NautilusPython::IFD0:Model","IFD0:Model","Camera (IFD0)","Camera model"),
            #nautilus.Column("NautilusPython::Nikon:ShutterCount","Nikon:ShutterCount","Shutter count (Nikon)","Number of shots taken by the camera up to this file"),
            nautilus.Column("NautilusPython::Mp","Mp","Megapixels (Exif)","Image size in megapixels"),
        )

    def update_file_info_full(self, provider, handle, closure, file):
        client = gconf.client_get_default()

        if not client.get_bool('/apps/nautilus/nautilus-metadata/enable'):
            client.set_bool('/apps/nautilus/nautilus-metadata/enable',0)
            return

        if file.get_uri_scheme() != 'file':
            return

        if file.get_mime_type() in ('image/jpeg', 'image/png', 'image/gif', 'image/bmp', 'image/x-nikon-nef', 'image/x-xcf', 'image/vnd.adobe.photoshop'):
            gobject.timeout_add_seconds(1, self.update_exif, provider, handle, closure, file)
            return Nautilus.OperationResult.IN_PROGRESS

        file.add_string_attribute('ExifIFD:DateTimeOriginal','')
        file.add_string_attribute('IFD0:Software','')
        file.add_string_attribute('ExifIFD:Flash','')
        file.add_string_attribute('Composite:ImageSize','')
        file.add_string_attribute('IFD0:Model','')
        file.add_string_attribute('Nikon:ShutterCount','')
        file.add_string_attribute('Mp','')

        return Nautilus.OperationResult.COMPLETE

    def update_exif(self, provider, handle, closure, file):
        filename = unquote(file.get_uri()[7:])

        data = getexiftool(filename)

        file.add_string_attribute('ExifIFD:DateTimeOriginal',data[0].replace(':','-',2))
        file.add_string_attribute('IFD0:Software',data[1])
        file.add_string_attribute('ExifIFD:Flash',data[2])
        file.add_string_attribute('Composite:ImageSize',data[3])
        file.add_string_attribute('IFD0:Model',data[4])
        #file.add_string_attribute('Nikon:ShutterCount',data[5])
        width, height = data[3].split('x')
        mp = float(width) * float(height) / 1000000
        mp = "%.2f" % mp
        file.add_string_attribute('Mp',str(mp) + ' Mp')

        Nautilus.info_provider_update_complete_invoke(closure, provider, handle, Nautilus.OperationResult.COMPLETE)

        # Returning False stops the timeout from firing again.
        return False

Solution

  • That happens because you are using update_file_info. Although it is part of the asynchronous IO system of Nautilus, the call itself has to return before Nautilus can continue, so it blocks Nautilus whenever the operation is not fast enough.

    In your case the problem is exacerbated because you are calling an external program, which is an expensive operation. Note that update_file_info is called once per file. If you have 100 files, you will call the external program 100 times, and Nautilus will have to wait for each call before processing the next file.
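
    One way to reduce the number of process launches is to pass several files to a single exiftool invocation and ask for JSON output (exiftool's -json option emits one object per file, keyed by SourceFile). The sketch below is written for Python 3 and only shows how such a command could be built and its output parsed. The names build_batch_command and parse_batch_output are hypothetical, the tag list is shortened, and how the extension would collect files into batches is left open, since Nautilus still asks for information one file at a time.

```python
import json

EXIFTOOL = '/usr/bin/exiftool'  # path taken from the question
TAGS = ['-ExifIFD:DateTimeOriginal', '-IFD0:Model']  # shortened tag list


def build_batch_command(filenames):
    """Return one exiftool command line that covers every given file."""
    return [EXIFTOOL, '-fast2', '-m', '-q', '-q', '-json'] + TAGS + list(filenames)


def parse_batch_output(text):
    """Map each SourceFile reported by exiftool to a dict of its tags."""
    result = {}
    for entry in json.loads(text):
        tags = dict(entry)
        source = tags.pop('SourceFile')
        result[source] = tags
    return result
```

    The command list could be handed to subprocess.check_output and the decoded output to parse_batch_output; files without a requested tag simply come back with that key missing.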

    Since nautilus-python 0.7, update_file_info_full and cancel_update are available, which allow you to write asynchronous calls. You can check the nautilus-python 0.7 documentation for more details.

    It is worth mentioning that this was a limitation of nautilus-python only, which previously did not expose those methods even though they were available in C.

    EDIT: Added a couple of examples.

    The trick is either to make the process as fast as possible or to make it asynchronous.

    Example 1: Invoking an external program

    Using a simplified version of your code, we make it asynchronous by calling GObject.timeout_add_seconds in update_file_info_full.

    from gi.repository import Nautilus, GObject
    from urllib import unquote
    from subprocess import Popen, PIPE
    
    def getexiftool(filename):
        options = '-fast2 -f -m -q -q -s3 -ExifIFD:DateTimeOriginal'
        exiftool = Popen(['/usr/bin/exiftool'] + options.split() + [filename],
                         stdout=PIPE, stderr=PIPE)
        output, errors = exiftool.communicate()
        return output.split('\n')
    
    class MyExtension(Nautilus.ColumnProvider, Nautilus.InfoProvider, GObject.GObject):
        def __init__(self):
            pass
    
        def get_columns(self):
            return (
                Nautilus.Column(name='MyExif::DateTime',
                                attribute='Exif:Image:DateTime',
                                label='Date Original',
                                description='Date time original'
                ),
            )
    
        def update_file_info_full(self, provider, handle, closure, file_info):
            if file_info.get_uri_scheme() != 'file':
                return
    
            # Default (empty) value for files that are not JPEG/PNG images.
            attr = ''
    
            if file_info.get_mime_type() in ('image/jpeg', 'image/png'):
                GObject.timeout_add_seconds(1, self.update_exif, 
                                            provider, handle, closure, file_info)
                return Nautilus.OperationResult.IN_PROGRESS
    
            file_info.add_string_attribute('Exif:Image:DateTime', attr)
    
            return Nautilus.OperationResult.COMPLETE
    
        def update_exif(self, provider, handle, closure, file_info):
            filename = unquote(file_info.get_uri()[7:])
    
            try:
                data = getexiftool(filename)
                attr = data[0]
            except Exception:
                # exiftool missing or unreadable output: leave the column empty.
                attr = ''
    
            file_info.add_string_attribute('Exif:Image:DateTime', attr)
    
            Nautilus.info_provider_update_complete_invoke(closure, provider, 
                                   handle, Nautilus.OperationResult.COMPLETE)
            return False
    

    The code above will not block Nautilus. If the 'Date Original' column is shown in the column view, the JPEG and PNG images will first show the value 'unknown' and will then be updated gradually (the subprocess is called after 1 second).

    Example 2: Using a library

    Rather than invoking an external program, it may be better to use a library, as in the example below:

    from gi.repository import Nautilus, GObject
    from urllib import unquote
    import pyexiv2
    
    class MyExtension(Nautilus.ColumnProvider, Nautilus.InfoProvider, GObject.GObject):
        def __init__(self):
            pass
    
        def get_columns(self):
            return (
                Nautilus.Column(name='MyExif::DateTime',
                                attribute='Exif:Image:DateTime',
                                label='Date Original',
                                description='Date time original'
                ),
            )
    
        def update_file_info_full(self, provider, handle, closure, file_info):
            if file_info.get_uri_scheme() != 'file':
                return
    
            filename = unquote(file_info.get_uri()[7:])
            attr = ''
    
            if file_info.get_mime_type() in ('image/jpeg', 'image/png'):
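                try:
                    # Reading can fail on corrupt or unreadable files,
                    # so it is kept inside the try block as well.
                    metadata = pyexiv2.ImageMetadata(filename)
                    metadata.read()
                    tag = metadata['Exif.Image.DateTime'].value
                    attr = tag.strftime('%Y-%m-%d %H:%M')
                except Exception:
                    # Missing tag or unparsable value: leave the column empty.
                    attr = ''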
    
            file_info.add_string_attribute('Exif:Image:DateTime', attr)
    
            return Nautilus.OperationResult.COMPLETE
    

    Finally, if the routine is still slow, you would need to make it asynchronous (maybe using something better than GObject.timeout_add_seconds).
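
    One candidate for "something better" is a worker thread that does the slow read and hands the result back through a dispatch function. The following is only a sketch in plain Python 3 (the queue module is named Queue in Python 2). BackgroundReader and its dispatch parameter are hypothetical names, not part of nautilus-python. In an extension, dispatch would presumably be GLib.idle_add, so that the callback, which would call add_string_attribute and info_provider_update_complete_invoke, runs in the main loop; that wiring is an assumption and has not been tested against Nautilus.

```python
import queue
import threading


class BackgroundReader(object):
    """Run a slow read function in one worker thread (hypothetical helper)."""

    def __init__(self, read_func, dispatch=None):
        self._read = read_func
        # Default dispatch calls the callback directly in the worker thread.
        # Inside Nautilus this is assumed to be GLib.idle_add instead.
        self._dispatch = dispatch or (lambda func, *args: func(*args))
        self._jobs = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, filename, on_done):
        """Queue a file; on_done(filename, result) is dispatched later."""
        self._jobs.put((filename, on_done))

    def wait(self):
        """Block until every queued job has been read and dispatched."""
        self._jobs.join()

    def _run(self):
        while True:
            filename, on_done = self._jobs.get()
            try:
                result = self._read(filename)
            except Exception:
                result = None  # a failed read is reported as "no data"
            self._dispatch(on_done, filename, result)
            self._jobs.task_done()
```

    With this shape, update_file_info_full would only call submit and return Nautilus.OperationResult.IN_PROGRESS, so the file manager never waits on exiftool itself. A callback scheduled through GLib.idle_add should return False so that it runs only once.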

    Last but not least, in my examples I used GObject Introspection (typical for Nautilus 3), but it is easy to change them to use the nautilus module directly.