
How to get attributes and their type from a dataclass?


I'd like to read all attributes and their types from a dataclass, as shown in this desired (pseudo)code:

from dataclasses import dataclass


@dataclass
class HelloWorld:
    name: str = 'Earth'
    is_planet: bool = True
    radius: int = 6371


if __name__ == '__main__':
    attrs = get_attributes(HelloWorld)
    for attr in attrs:
        print(attr.name, attr.type)  # name, str

I checked several answers, but couldn't find what I need yet.

Any idea?


Solution

  • For classes in general, you can access the __annotations__:

    >>> class Foo:
    ...    bar: int
    ...    baz: str
    ...
    >>> Foo.__annotations__
    {'bar': <class 'int'>, 'baz': <class 'str'>}
    

    This returns a dict mapping attribute name to annotation.

    However, dataclasses use dataclasses.Field objects to encapsulate a lot of this information. You can use dataclasses.fields on an instance or on the class:

    >>> import dataclasses
    >>> @dataclasses.dataclass
    ... class Foo:
    ...     bar: int
    ...     baz: str
    ...
    >>> dataclasses.fields(Foo)
    (Field(name='bar',type=<class 'int'>,default=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD), Field(name='baz',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD))
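
Applied to the class from the question, this could look like the sketch below. The name `get_attributes` is taken from the question's pseudocode and is not part of the standard library; it returns plain `(name, type)` tuples rather than objects with `.name` and `.type`.

```python
from dataclasses import dataclass, fields


@dataclass
class HelloWorld:
    name: str = 'Earth'
    is_planet: bool = True
    radius: int = 6371


def get_attributes(cls):
    """Return (name, type) pairs for each dataclass field, in definition order.

    Hypothetical helper named after the question's pseudocode.
    """
    return [(field.name, field.type) for field in fields(cls)]


if __name__ == '__main__':
    for name, type_ in get_attributes(HelloWorld):
        print(name, type_)  # first line printed: name <class 'str'>
```

If you want the original `attr.name` / `attr.type` access, iterating over `fields(HelloWorld)` directly gives you the `Field` objects.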
    

    NOTE:

    Starting in Python 3.7, the evaluation of annotations can be postponed:

    >>> from __future__ import annotations
    >>> class Foo:
    ...     bar: int
    ...     baz: str
    ...
    >>> Foo.__annotations__
    {'bar': 'int', 'baz': 'str'} 
    

    Note that the annotation is kept as a string. This affects dataclasses as well:

    >>> @dataclasses.dataclass
    ... class Foo:
    ...     bar: int
    ...     baz: str
    ...
    >>> dataclasses.fields(Foo)
    (Field(name='bar',type='int',default=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD), Field(name='baz',type='str',default=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f806369bc10>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD))
    

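    So be aware: postponed evaluation was originally planned to become the default in Python 3.10, but that change was put on hold and has since been superseded by PEP 649 (deferred evaluation of annotations, the default from Python 3.14). Either way, code you write should not assume that an annotation has already been evaluated to a type object, because it may be a string.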

    The motivation behind this behavior is that, when annotations are evaluated eagerly, the following raises an error:

    >>> class Node:
    ...    def foo(self) -> Node:
    ...        return Node()
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in Node
    NameError: name 'Node' is not defined
    

    But with the new behavior:

    >>> from __future__ import annotations
    >>> class Node:
    ...     def foo(self) -> Node:
    ...         return Node()
    ...
    >>>
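
As a side note, a similar effect is available for a single annotation without the `__future__` import, by writing the annotation as a string yourself. A minimal sketch, reusing the `Node` class from above:

```python
class Node:
    # Quoting the annotation defers the name lookup, so the class body
    # does not need `Node` to exist yet when `foo` is defined.
    def foo(self) -> "Node":
        return Node()


# The annotation is stored as the string that was written.
print(Node.foo.__annotations__)  # {'return': 'Node'}
```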
    

    One way to handle this is to use typing.get_type_hints, which evaluates string annotations and returns the resulting objects:

    >>> import typing
    >>> typing.get_type_hints(Node.foo)
    {'return': <class '__main__.Node'>}
    >>> class Foo:
    ...    bar: int
    ...    baz: str
    ...
    >>> Foo.__annotations__
    {'bar': 'int', 'baz': 'str'}
    >>> import typing
    >>> typing.get_type_hints(Foo)
    {'bar': <class 'int'>, 'baz': <class 'str'>}
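
Combining the two ideas, a dataclass helper can take the field names from `dataclasses.fields` and the evaluated types from `typing.get_type_hints`. The sketch below uses a hypothetical helper name, `resolved_field_types`, and writes the annotations as strings by hand to stand in for postponed annotations. It assumes the class lives in a normally imported module or script.

```python
import dataclasses
import typing


@dataclasses.dataclass
class Foo:
    # String annotations, written by hand here, are stored in Field.type
    # as strings, just like annotations postponed via the __future__ import.
    bar: "int"
    baz: "str"


def resolved_field_types(cls):
    """Map each dataclass field name to its evaluated type (hypothetical helper)."""
    hints = typing.get_type_hints(cls)  # evaluates the string annotations
    return {field.name: hints[field.name] for field in dataclasses.fields(cls)}


raw = {field.name: field.type for field in dataclasses.fields(Foo)}
resolved = resolved_field_types(Foo)
print(raw)       # {'bar': 'int', 'baz': 'str'}
print(resolved)  # {'bar': <class 'int'>, 'baz': <class 'str'>}
```

Going through `fields` rather than returning the hints dict directly keeps the result limited to actual dataclass fields, since `get_type_hints` also reports annotations such as `ClassVar` attributes.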
    

    I'm not sure how robust this function is in every case, but it takes care of looking up the globals and locals of the place where the class was defined. Consider:

    (py38) juanarrivillaga@Juan-Arrivillaga-MacBook-Pro ~ % cat test.py
    from __future__ import annotations
    
    import typing
    
    class Node:
        next: Node
    
    (py38) juanarrivillaga@Juan-Arrivillaga-MacBook-Pro ~ % python
    Python 3.8.5 (default, Sep  4 2020, 02:22:02)
    [Clang 10.0.0 ] :: Anaconda, Inc. on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import test
    >>> test.Node
    <class 'test.Node'>
    >>> import typing
    >>> typing.get_type_hints(test.Node)
    {'next': <class 'test.Node'>}
    

    Naively, you might try something like:

    >>> test.Node.__annotations__
    {'next': 'Node'}
    >>> eval(test.Node.__annotations__['next'])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 1, in <module>
    NameError: name 'Node' is not defined
    

    You could hack together something like:

    >>> eval(test.Node.__annotations__['next'], vars(test))
    <class 'test.Node'>
    

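    But this can get tricky quickly, which is why typing.get_type_hints is usually preferable to evaluating the strings yourself.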
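
One case where name resolution needs extra help is a class defined inside a function, because the class name is then not a module-level global. `typing.get_type_hints` accepts `globalns` and `localns` arguments for this. The sketch below is an illustration under that assumption; `make_node` is a made-up function name.

```python
import typing


def make_node():
    class Node:
        next: "Node"

    # `Node` is local to this function, so the default lookup in the
    # module globals would not find it; supply it explicitly via localns.
    hints = typing.get_type_hints(Node, localns={"Node": Node})
    return Node, hints


Node, hints = make_node()
print(hints["next"] is Node)  # True
```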