python private name-mangling

Instance attribute that has a name starting with two underscores is weirdly renamed


With the current implementation of my class, when I try to get the value of a private attribute using a method of the class, I get None as the output. Any ideas on where I'm going wrong?

Code

from abc import ABC, abstractmethod

class Search(ABC):
    @abstractmethod
    def search_products_by_name(self, name):
        print('found', name)


class Catalog(Search):
    def __init__(self):
        self.__product_names = {}
    
    def search_products_by_name(self, name):
        super().search_products_by_name(name)
        return self.__product_names.get(name)


x = Catalog()
x.__product_names = {'x': 1, 'y':2}
print(x.search_products_by_name('x'))

Solution

  • What's happening in this code?

    The code above seems fine, but has some behaviour that might seem unusual. If we type this in an interactive console:

    c = Catalog()
    # vars() returns the instance dict of an object,
    # showing us the value of all its attributes at this point in time.
    vars(c)
    

    Then the result is this:

    {'_Catalog__product_names': {}}
    

    That's pretty weird! In our class definition, we didn't give any attribute the name _Catalog__product_names. We named one attribute __product_names, but that attribute appears to have been renamed.

    What's going on?

    This behaviour isn't a bug; it's a feature of Python known as private name mangling. Any identifier that appears inside a class definition, begins with two underscores and does not end with two underscores is rewritten by the compiler to include the class name. An attribute named __foo in class Bar will be renamed _Bar__foo; an attribute named __spam in class Breakfast will be renamed _Breakfast__spam; and so on.
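    As a minimal sketch of that rule (the attribute names __spam__ and _eggs are made up for illustration), only the name with two leading underscores and no two trailing underscores ends up renamed in the instance dict:

```python
class Bar:
    def __init__(self):
        self.__foo = 1       # two leading underscores: stored as _Bar__foo
        self.__spam__ = 2    # also ends with two underscores: not mangled
        self._eggs = 3       # single leading underscore: not mangled


# vars() shows the names the attributes were actually stored under.
bar_attrs = vars(Bar())
print(sorted(bar_attrs))
```

    Running this should print ['_Bar__foo', '__spam__', '_eggs'].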

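    The mangling is applied only to code written inside the class body. Methods within the class can therefore keep using the "private" name that you wrote in __init__, because every occurrence of it there is rewritten in the same way. Code outside the class is not rewritten at all, which is why the code in the question prints None: x.__product_names = {'x': 1, 'y':2} creates a brand-new attribute that is literally called __product_names, while search_products_by_name still reads _Catalog__product_names, which is the empty dict set in __init__.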
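    To see what this means for the code in the question, here is a cut-down sketch of Catalog (the abstract base class is dropped and the method is renamed to a hypothetical lookup, purely to keep it short). Names written inside the class body are mangled, while the assignment made from outside the class is not, so the instance ends up with two separate attributes:

```python
class Catalog:
    def __init__(self):
        # Inside the class body this is compiled as self._Catalog__product_names.
        self.__product_names = {}

    def lookup(self, name):
        # Also compiled as self._Catalog__product_names.
        return self.__product_names.get(name)


x = Catalog()
# Outside the class body no mangling is applied, so this creates a second,
# unrelated attribute that is literally called '__product_names'.
x.__product_names = {'x': 1, 'y': 2}

attrs = vars(x)
result = x.lookup('x')
print(attrs)
print(result)
```

    The first print should show both '_Catalog__product_names' (still the empty dict) and '__product_names' (the dict assigned from outside); the second prints None, because lookup only ever reads the mangled attribute.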

    Why would you ever want this?

    I personally have never found a use case for this feature, and am somewhat sceptical of it. Its main use cases are for situations where you want a method or an attribute to be privately accessible within a class, but not accessible by the same name to functions outside of the class or to other classes inheriting from this class.

    (N.B. The YouTube talk is from 2013, and the examples in the talk are written in Python 2, so some of the syntax in the examples is a little different from modern Python: print is still a statement rather than a function, etc.)

    Here is an illustration of how private name mangling works when using class inheritance:

    >>> class Foo:
    ...   def __init__(self):
    ...     self.__private_attribute = 'No one shall ever know'
    ...   def baz_foo(self):
    ...     print(self.__private_attribute)
    ...     
    >>> class Bar(Foo):
    ...   def baz_bar(self):
    ...     print(self.__private_attribute)
    ...     
    >>> 
    >>> b = Bar()
    >>> b.baz_foo()
    No one shall ever know
    >>> 
    >>> b.baz_bar()
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "<string>", line 3, in baz_bar
    AttributeError: 'Bar' object has no attribute '_Bar__private_attribute'
    >>>
    >>> vars(b)
    {'_Foo__private_attribute': 'No one shall ever know'}
    >>>
    >>> b._Foo__private_attribute
    'No one shall ever know'
    

    The methods defined in the base class Foo are able to access the private attribute using its private name that was defined in Foo. The methods defined in the subclass Bar, however, are only able to access the private attribute by using its mangled name; anything else leads to an exception.

    collections.OrderedDict is a good example of a class in the standard library whose pure-Python implementation uses name mangling, so that subclasses of OrderedDict do not accidentally override certain methods and attributes that are important to the way OrderedDict works.
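    The general pattern can be sketched as follows. This is not OrderedDict's actual source; the classes Mapping and LoudMapping are hypothetical, in the style of the example in the Python tutorial's section on private variables. The base class keeps a mangled private alias of update, so its __init__ keeps working even when a subclass replaces update with an incompatible signature:

```python
class Mapping:
    def __init__(self, pairs):
        self.items = []
        # Compiled as self._Mapping__update, so this always reaches
        # Mapping's own implementation, even on a subclass instance.
        self.__update(pairs)

    def update(self, pairs):
        for pair in pairs:
            self.items.append(pair)

    # Private alias of update(); stored on the class as _Mapping__update.
    __update = update


class LoudMapping(Mapping):
    def update(self, keys, values):
        # The subclass changes the signature of update() without
        # breaking Mapping.__init__, which uses the private alias.
        for pair in zip(keys, values):
            self.items.append(pair)


m = LoudMapping([('a', 1)])
m.update(['b', 'c'], [2, 3])
print(m.items)
```

    If Mapping.__init__ called self.update(pairs) instead, constructing a LoudMapping would fail with a TypeError, because the overriding update expects two arguments.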

    How do I fix this?

    The obvious solution here is to rename your attribute so that it has only a single leading underscore, as in the code below. This still sends a clear signal to external users that this is a private attribute that should not be directly modified by functions or classes outside of the class, but does not lead to any weird name mangling behaviour:

    from abc import ABC, abstractmethod
    
    class Search(ABC):
        @abstractmethod
        def search_products_by_name(self, name):
            print('found', name)
    
    
    class Catalog(Search):
        def __init__(self):
            self._product_names = {}
        
        def search_products_by_name(self, name):
            super().search_products_by_name(name)
            return self._product_names.get(name)
    
    
    x = Catalog()
    x._product_names = {'x': 1, 'y':2}
    print(x.search_products_by_name('x'))
    

    Another solution is to roll with the name mangling, either like this:

    from abc import ABC, abstractmethod
    
    class Search(ABC):
        @abstractmethod
        def search_products_by_name(self, name):
            print('found', name)
    
    
    class Catalog(Search):
        def __init__(self):
            self.__product_names = {}
        
        def search_products_by_name(self, name):
            super().search_products_by_name(name)
            return self.__product_names.get(name)
    
    
    x = Catalog()
    # we have to use the mangled name when accessing it from outside the class
    x._Catalog__product_names = {'x': 1, 'y':2}
    print(x.search_products_by_name('x'))
    

    Or — and this is probably better, since it's just a bit weird to be accessing an attribute using its mangled name from outside the class — like this:

    from abc import ABC, abstractmethod
    
    class Search(ABC):
        @abstractmethod
        def search_products_by_name(self, name):
            print('found', name)
    
    
    class Catalog(Search):
        def __init__(self):
            self.__product_names = {}
        
        def search_products_by_name(self, name):
            super().search_products_by_name(name)
            return self.__product_names.get(name)
    
        def set_product_names(self, product_names):
            # we can still use the private name from within the class
            self.__product_names = product_names
    
    
    x = Catalog()
    x.set_product_names({'x': 1, 'y':2})
    print(x.search_products_by_name('x'))