debugginglogging

Reporting information during code execution: best design


I always had doubts when it comes to designing proper report of execution.

Say you have the following (stupid, to be simple) case. I will use python.

def doStuff():
    doStep1()
    doStep2()
    doStep3()

Now, suppose you want to give a report of the various steps, if something goes wrong etc. Not really debug: just informative behavior of the application.

A first, easy solution is to put prints

def doStuff():
    print "starting doing stuff"
    print "I am starting to do step 1"
    doStep1()
    print "I did step 1"
    print "I am starting to do step 2"
    doStep2()
    print "I did step 2"
    print "I am starting to do step 3"
    doStep3()
    print "I did step 3"

In general, this is quite bad. Suppose that this code is going to end up in a library. I would not expect my library to print stuff out. I would expect it to do the job silently. Still, sometimes I would like to provide information, not only in debug situations, but also to keep the user informed that something is actually in the process of being done. Print is also bad because you don't have control of the handling of your messages. it just goes to stdout, and there's nothing you can do about it, except redirection.

Another solution is to have a module for logging.

def doStuff():
    Logging.log("starting doing stuff")
    Logging.log("I am starting to do step 1")
    doStep1()
    Logging.log("I did step 1")
    Logging.log("I am starting to do step 2")
    doStep2()
    Logging.log("I did step 2")
    Logging.log("I am starting to do step 3")
    doStep3()
    Logging.log("I did step 3")

This has the advantage that you sort of know a unique place for your logging service, and you can tinker this service as much as you want. You can silence it, redirect it onto a file, to stdout, or even to a network. Disadvantage is that you get a very strong coupling with the Logging module. Basically every part of your code depends on it, and you have calls to logging everywhere.

The third option is to have a report object with a clear interface, and you pass it around

def doStuff(reporter=NullReporter()):
    reporter.log("starting doing stuff")
    reporter.log("I am starting to do step 1")
    doStep1()
    reporter.log("I did step 1")
    reporter.log("I am starting to do step 2")
    doStep2()
    reporter.log("I did step 2")
    reporter.log("I am starting to do step 3")
    doStep3()
    reporter.log("I did step 3")

Eventually, you can also pass the reporter object to doStepX() if they have more to say. Advantage: it reduces coupling with a module, but it introduces coupling with the instantiation of the NullReporter object. This can be solved by using None as default and checking before calling log, which is clumsy, because in python you have to write a conditional every time (in C you could define a macro)

def doStuff(reporter=None):
    if reporter is not None:
        reporter.log("starting doing stuff")
        # etc...

Edit: Another option is to work Qt-like, and have an emit() signal strategy. As your code executes, it emits information with proper status codes, and anyone interested can subscribe to the signals and provide info. Nice and clean, very decoupled, but requires a little of coding, as I don't think this can quickly be done with the python battery included.

Finally, you can raise exceptions with a meaningful error message, but this of course can be used only if you are exiting from an error condition. it does not work for occasional reporting.

Edit: I'd like to clarify the fact that the situation is more general, and not limited just to a sequence of invoked steps. it could also involve control structures:

 if disconnected:
     print "Trying to connect"
     connect()
 else:
     print "obtaining list of files from remote host"
     getRemoteList()

The report could also be into the actual routines, so you would have a "print" in the connect() and getRemoteList() routines as a first statement.

The question therefore are:


Edit: more thoughts for the mind

I think it's not only a matter of decoupling the Logging code from the logic code. I think it's also a matter of decoupling the information production from the information consumption. Similar techniques already exist, in particular to handle UI events, but I don't really see the same patterns applied to the logging problem.


Edit: I accepted the answer from Marcelo because he points out at the factual evidence that a compromise is the best solution in this case, and there's no silver bullet.


Solution

  • I think there is a point where you must draw a line and make a compromise. I see no way to completely decouple logging from the system because you have to send those messages somewhere and in a way that someone understands.

    I would go with the default logging module, because... it's the default module. It's well documented and comes with the default library, so no dependency issues here. Also, you save yourself from reinventing the wheel.

    That said, if you are really into doing something new, you could make a global reporter object. You can instantiate and configure it at the beginning of your process (logging, no logging, redirecting streams, etc. Even in a per process/function/step basis) and call it from everywhere, no need to pass it around (maybe in a multi threaded environment, but that would be minimal).

    You can also put it inside another thread and catch log events a la Qt.