python

Underscore vs Double underscore with variables and methods


Somebody was nice enough to explain to me that __method() mangles but instead of bothering him further since there are a lot of other people who need help I was wondering if somebody could elaborate the differences further.

For example I don't need mangling but does _ stay private so somebody couldn't do instance._method()? Or does it just keep it from overwriting another variable by making it unique? I don't need my internal methods "hidden" but since they are specific to use I don't want them being used outside of the class.


Solution

  • From PEP 8:

    • _single_leading_underscore: weak "internal use" indicator. E.g.

    from M import *

    does not import objects whose name starts with an underscore.

    Tkinter.Toplevel(master, class_='ClassName')

    • __double_leading_underscore: when naming a class attribute, invokes name mangling (inside class FooBar, __boo becomes _FooBar__boo; see below).

    Also, from David Goodger's Code Like a Pythonista:

    Attributes: interface, _internal, __private

    But try to avoid the __private form. I never use it. Trust me. If you use it, you WILL regret it later.

    Explanation:

    People coming from a C++/Java background are especially prone to overusing/misusing this "feature". But __private names don't work the same way as in Java or C++. They just trigger a name mangling whose purpose is to prevent accidental namespace collisions in subclasses: MyClass.__private just becomes MyClass._MyClass__private. (Note that even this breaks down for subclasses with the same name as the superclass, e.g. subclasses in different modules.) It is possible to access __private names from outside their class, just inconvenient and fragile (it adds a dependency on the exact name of the superclass).

    The problem is that the author of a class may legitimately think "this attribute/method name should be private, only accessible from within this class definition" and use the __private convention. But later on, a user of that class may make a subclass that legitimately needs access to that name. So either the superclass has to be modified (which may be difficult or impossible), or the subclass code has to use manually mangled names (which is ugly and fragile at best).

    There's a concept in Python: "we're all consenting adults here". If you use the __private form, who are you protecting the attribute from? It's the responsibility of subclasses to use attributes from superclasses properly, and it's the responsibility of superclasses to document their attributes properly.

    It's better to use the single-leading-underscore convention, _internal. "This isn't name mangled at all; it just indicates to others to "be careful with this, it's an internal implementation detail; don't touch it if you don't fully understand it". It's only a convention though.