Tags: python, variables, singleton, module, scope

How to create module-wide variables in Python?


Is there a way to set up a global variable inside of a module? When I tried to do it the most obvious way as appears below, the Python interpreter said the variable __DBNAME__ did not exist.

...
__DBNAME__ = None

def initDB(name):
    if not __DBNAME__:
        __DBNAME__ = name
    else:
        raise RuntimeError("Database name has already been set.")
...

And after importing the module in a different file:

...
import mymodule
mymodule.initDB('mydb.sqlite')
...

And the traceback was:

UnboundLocalError: local variable '__DBNAME__' referenced before assignment

Any ideas? I'm trying to set up a singleton by using a module, as per this fellow's recommendation.


Solution

  • Here is what is going on.

    First, the only global variables Python really has are module-scoped variables. You cannot make a variable that is truly global; all you can do is make a variable in a particular scope. (If you make a variable at the interactive interpreter prompt, it lives in the scope of the __main__ module. It acts as a global for your session, but modules you import do not see it automatically.)

    All you have to do to make a module-global variable is just assign to a name.

    Imagine a file called foo.py, containing this single line:

    X = 1
    

    Now imagine you import it.

    import foo
    print(foo.X)  # prints 1
    

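    However, suppose you want to use one of your module-scope variables as a global inside a function, as in your example. Python treats any name that is assigned anywhere in a function as local to that function, for the whole function body. In your initDB, the assignment __DBNAME__ = name makes __DBNAME__ local, so the earlier if not __DBNAME__ test reads a local that has no value yet, hence the UnboundLocalError. The fix is to add a global declaration in the function, before you use the name.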

    def initDB(name):
        global __DBNAME__  # add this line!
        if __DBNAME__ is None: # see notes below; explicit test for None
            __DBNAME__ = name
        else:
            raise RuntimeError("Database name has already been set.")
    

    By the way, for this example, the simple if not __DBNAME__ test is adequate, because any string value other than an empty string will evaluate true, so any actual database name will evaluate true. But for variables that might contain a number value that might be 0, you can't just say if not variablename; in that case, you should explicitly test for None using the is operator. I modified the example to add an explicit None test. The explicit test for None is never wrong, so I default to using it.
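
    A short sketch of the case where the two tests disagree. The _retries setting and both setter functions are hypothetical, chosen only because 0 is a plausible value for them.

```python
_retries = None  # None means "not configured yet" (hypothetical setting)


def set_retries_truthy(n):
    global _retries
    if not _retries:       # also true when _retries is 0
        _retries = n


def set_retries_explicit(n):
    global _retries
    if _retries is None:   # true only when nothing has been stored
        _retries = n


set_retries_truthy(0)
set_retries_truthy(5)      # silently overwrites the stored 0
after_truthy = _retries

_retries = None            # reset for the second experiment
set_retries_explicit(0)
set_retries_explicit(5)    # leaves the stored 0 alone
after_explicit = _retries

print(after_truthy, after_explicit)  # 5 0
```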

    Finally, as others have noted on this page, a leading underscore signals that you want the variable to be "private" to the module; one underscore is enough, and two behave the same way at module level. If you ever do a from mymodule import *, Python will not import names that start with an underscore into your namespace (unless the module lists them in __all__). But if you just do a simple import mymodule and then say dir(mymodule), you will see the "private" variables in the list, and if you explicitly refer to mymodule.__DBNAME__, Python won't care; it will just let you refer to it. The leading underscores are a major clue to users of your module that you don't want them rebinding that name to some value of their own. (Names with both leading and trailing double underscores, like __DBNAME__, follow the pattern Python reserves for its own special names, so something like _dbname is a safer choice.)
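
    The import * behaviour can be checked without creating files. The sketch below builds a throwaway module in memory; demo_mod and its attribute names are invented for the demonstration.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name: demo_mod).
demo_mod = types.ModuleType("demo_mod")
demo_mod.public_name = 1
demo_mod._one_underscore = 2
demo_mod.__DBNAME__ = "mydb.sqlite"
sys.modules["demo_mod"] = demo_mod  # make it importable by name

# Run the star import in a fresh namespace and see what arrived.
namespace = {}
exec("from demo_mod import *", namespace)
imported = sorted(k for k in namespace if k != "__builtins__")
print(imported)  # ['public_name']

# Explicit attribute access is not restricted in any way.
print(demo_mod.__DBNAME__)  # mydb.sqlite
```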

    It is considered best practice in Python not to do import *, but to minimize the coupling and maximize explicitness by either using mymodule.something or by explicitly doing an import like from mymodule import something.

    EDIT: If, for some reason, you cannot or would rather not use the global statement (for example, in a very old version of Python that doesn't have the global keyword), there is an easy workaround. Instead of setting a module global variable directly, use a mutable type at the module global level, and store your values inside it.

    In your functions, the global variable name will be effectively read-only: you won't be able to rebind the global name itself. (If you assign to that name inside your function, you create a local variable that shadows the global within that function.) But you can still use the global name to reach the object it refers to, and store data inside that object, because mutating an object does not rebind the name.

    You can use a list but your code will be ugly:

    __DBNAME__ = [None] # use length-1 list as a mutable
    
    # later, in code:  
    if __DBNAME__[0] is None:
        __DBNAME__[0] = name
    

    A dict is better. But the most convenient is a class instance, and you can just use a trivial class:

    class Box:
        pass
    
    __m = Box()  # __m will contain all module-level values
    __m.dbname = None  # database name global in module
    
    # later, in code:
    if __m.dbname is None:
        __m.dbname = name
    

    (You don't really need to capitalize the database name variable.)

    I like the syntactic sugar of writing __m.dbname rather than __m["DBNAME"]; to me it is the most convenient solution. But the dict solution works fine too.

    With a dict you can use any hashable value as a key, but when you are happy with names that are valid identifiers, you can use a trivial class like Box in the above.