
Writing atomic functions in Python


In many systems, transactions are used to group operations together so that they are atomic, meaning they either all succeed or all fail. However, Python itself does not seem to have built-in support for transactions outside the context of databases.

For example, I have two functions as follows:

create_user_record(user_record)
create_directory(directory_name)

I want to wrap these functions in an operation in which if one of them fails, the other fails too. How can I achieve this?


Solution

  • Transactions in databases are easy to group together because all operations do the same kind of thing: they operate on data that has a well-defined "previous" state which can be rolled back to.

    A "generic" atomic operation for "everything" can't be done for actions that have "side-effects" on the system. In your own example, suppose both "create_user_record" and "create_directory" operate on the file system and "create_directory" fails. How is the Python runtime to know what the file-system state was before the failure, if these two operations are supposed to be atomically grouped?

    It is only possible if both function calls, in your case, are bound to an object which manages this external state, and which can then either perform an "undo" or just enqueue all actions and carry them all out at once if things succeed. It is possible to create such a manager for file-system actions, for example: it would record every operation made and prepare a "rollback" action for each one, in case a rollback is needed. But if you add further "side-effects", whether I/O actions or changes to global program state (calling methods on existing objects, changing the values of data structures or global variables), all of those actions need to be "managed" by the same entity.
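
    As a rough illustration of the "record every operation and prepare a rollback action" idea, here is a minimal sketch. The `UndoLog` class, its `do` method, the `users` dictionary, `delete_user_record` and `create_user` are all hypothetical names invented for this example, and the bodies of `create_user_record` and `create_directory` are stand-ins, since the question does not show the real ones. It assumes every action has a cheap, reliable inverse, which is often not true in practice.

```python
import os


class UndoLog:
    """Hypothetical manager: runs actions and remembers how to undo each one."""

    def __init__(self):
        self._undo_stack = []

    def do(self, action, undo):
        """Run `action`; only if it succeeds, remember `undo` for a rollback."""
        result = action()
        self._undo_stack.append(undo)
        return result

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is not None:
            # Something failed: undo the completed actions, newest first.
            while self._undo_stack:
                self._undo_stack.pop()()
        self._undo_stack.clear()
        return False  # never swallow the original exception


# Stand-in implementations (assumptions, not the asker's real functions).
users = {}


def create_user_record(user_record):
    users[user_record["name"]] = user_record


def delete_user_record(user_record):
    del users[user_record["name"]]


def create_directory(directory_name):
    os.mkdir(directory_name)  # raises FileExistsError if it already exists


def create_user(user_record, directory_name):
    """Create the record and the directory, or neither."""
    with UndoLog() as log:
        log.do(lambda: create_user_record(user_record),
               undo=lambda: delete_user_record(user_record))
        log.do(lambda: create_directory(directory_name),
               undo=lambda: os.rmdir(directory_name))
```

    If `create_directory` raises, the user record added just before is removed again and the exception still propagates to the caller. Note that this sketch does nothing sensible if an undo action itself fails, and it offers no isolation: other code can observe the intermediate state.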

    All in all: that is feasible, but you have to craft a special manager object, analogous to the database connection, and perform all the changes that you want to be atomic by means of this object, either explicitly or by having the object implicitly shield your side-effect operations. The more domains you want to include in the "atomicity" (e.g. file I/O, global variable changes, network I/O using a specific, already instantiated API client), the more complex your manager has to be.

    That said, such an object can be created, and the context-manager syntax of Python (the with statement block) is ideal to allow this. Here is a simple class which can manage access to global variables, and "undo" all changes to global variables in a similar way to a rollback:

    import inspect

    class GlobalVarsTransaction:
        def __enter__(self):
            # Snapshot (shallow copy) of the caller's module-level globals.
            self.previous_globals = inspect.currentframe().f_back.f_globals.copy()
            return self

        def __exit__(self, exc_type, exc_value, tb):
            if exc_type is None:
                # everything fine, allow all global vars set to persist!
                return
            # "rollback" global variables:
            caller_globals = inspect.currentframe().f_back.f_globals
            new_keys = caller_globals.keys() - self.previous_globals.keys()
            # Restore the old bindings (this also re-creates deleted names).
            caller_globals.update(self.previous_globals)
            # Remove names that were first created inside the block.
            for key in new_keys:
                del caller_globals[key]
    

    Note that this is intended as an example, and would cover few "real world" uses. In particular, it does not cover in-place modification of a data structure in the global scope: if one changes elements inside a list or dictionary bound to a variable in the current global scope, that change is not tracked by this manager.

    Here is a straightforward test function showing the usage of the class above:
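
    The reason for that limitation is that `dict.copy()` is shallow: the snapshot shares the same list and dictionary objects as the live namespace. The small sketch below shows the effect on an ordinary dictionary standing in for a module's globals (the `namespace` name and its contents are made up for illustration). A deep copy does keep the old contents, but it is only a partial answer, because real globals usually contain objects such as modules that cannot be deep-copied.

```python
import copy

# Stand-in for a module's global namespace (hypothetical contents).
namespace = {"counter": 0, "items": [1, 2]}

shallow_snapshot = namespace.copy()        # what GlobalVarsTransaction stores
deep_snapshot = copy.deepcopy(namespace)   # also copies the nested list

namespace["counter"] = 99      # rebinding a name: the snapshot is unaffected
namespace["items"].append(3)   # in-place mutation: leaks into the shallow copy

# The shallow snapshot still holds the *same* list object as the namespace,
# so restoring from it cannot bring back the old list contents.
same_object = shallow_snapshot["items"] is namespace["items"]
```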

    def test():
        global a, b, c, d, e, f, g
        a = b = c = d = e = f = g = 0
        del g
    
        with GlobalVarsTransaction():
            a = 1
            b = 2
        assert a == 1 and b == 2
        try:
            c = 3
            raise RuntimeError()
            d = 4
        except RuntimeError:
            pass
        # partial state change took place
        assert c == 3 and d == 0
        try:
            with GlobalVarsTransaction():
                e = 5
                raise RuntimeError()
                f = 6
        except RuntimeError:
            pass
        assert e == 0 and f == 0
        try:
            with GlobalVarsTransaction():
                g = 7
                raise RuntimeError()
        except RuntimeError:
            pass
        try:
            g  # should fail, as "g" is deleted at the beginning of the test
        except NameError:
            pass
        else:
            assert False, "variable created in failed transactions is set"
    test()
    

    All that said, it is interesting to note that, should one choose to build a system that can perform atomic operations across different domains, it might be a good idea (and it may even be impossible otherwise) to use a two-phase commit protocol, with a separate manager class like the one in this example for each domain, each implementing the steps the protocol requires.
    Should one go in that direction, the zope.transaction transaction package is production quality, has existed for decades, and can certainly help one implement that correctly.
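
    To make the two-phase commit idea a little more concrete, here is a bare-bones sketch in plain Python. It does not use the package mentioned above or imitate its API; `Participant`, its `prepare`/`commit`/`rollback` methods and the `two_phase_commit` function are hypothetical names chosen for this illustration. The assumption is that each domain manager does all of its failure-prone work in `prepare`, so that `commit` can no longer fail.

```python
class Participant:
    """Hypothetical per-domain manager taking part in a two-phase commit."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether prepare succeeds
        self.state = "idle"

    def prepare(self):
        # Phase 1: do everything that might fail, but keep it reversible.
        if not self.can_commit:
            raise RuntimeError(f"{self.name} cannot prepare")
        self.state = "prepared"

    def commit(self):
        # Phase 2: make the prepared work permanent; assumed not to fail.
        self.state = "committed"

    def rollback(self):
        # Must be safe to call in any state.
        self.state = "rolled back"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all rolled back."""
    try:
        for participant in participants:
            participant.prepare()
    except Exception:
        # One "no" vote aborts the whole transaction for everybody.
        for participant in participants:
            participant.rollback()
        return False
    for participant in participants:
        participant.commit()
    return True
```

    For example, `two_phase_commit([Participant("files"), Participant("network", can_commit=False)])` leaves both participants rolled back. A real implementation also has to deal with crashes between the two phases, which this sketch ignores.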