Tags: python, python-typing, typeddict

Get keys of nested TypedDicts


Given the following TypedDicts:

class MySubClass(TypedDict):
    name: str
    color: str

class MyClass(TypedDict):
    x: MySubClass
    y: str

What function can extract the keys recursively, like this:

[x_name, x_color, y]

The function should be dynamic so that it can extract all kinds of nested structures, but one level of nesting is already enough.

Many Thanks!


Solution

  • Python >=3.10

    For Python >=3.10 we have the typing.is_typeddict function, which we can use to check whether something is in fact a TypedDict subtype.

    We can use typing.get_type_hints (part of typing since Python 3.5) on a TypedDict class to get the keys and corresponding types defined on it. (This is likely better than using its __annotations__ dictionary directly, as pointed out by @chepner in a comment.)

    A simple recursive function to get the keys the way you wanted might look like this:

    from typing import get_type_hints, is_typeddict
    
    
    def get_typed_dict_keys(cls: type) -> list[str]:
        keys: list[str] = []
        for key, type_ in get_type_hints(cls).items():
            if is_typeddict(type_):
                keys.extend(
                    f"{key}_{sub_key}"
                    for sub_key in get_typed_dict_keys(type_)
                )
            else:
                keys.append(key)
        return keys
    

    Demo:

    from typing import TypedDict
    
    
    class Foo(TypedDict):
        name: str
        color: str
    
    
    class Bar(TypedDict):
        x: Foo
        y: str
    
    
    class Baz(TypedDict):
        a: int
        b: Bar
    
    
    print(get_typed_dict_keys(Baz))
    

    Output: ['a', 'b_x_name', 'b_x_color', 'b_y']

    Note that forming the nested keys by joining with an underscore can produce name collisions: a nested key b under a and a top-level key a_b would both come out as a_b. But I am sure you are aware of that.

    Side note: If you are on Python >=3.11 and have some keys that are annotated with NotRequired, this solution should still work because get_type_hints resolves those annotations to the underlying type.

    E.g. the following Baz class:

    from typing import TypedDict, NotRequired
    
    ...
    
    class Baz(TypedDict):
        a: int
        b: NotRequired[Bar]
    

    The function would still work and return the same output.


  • Python 3.9

    Here we need to get creative because is_typeddict is not available to us. To shamelessly steal bits from Pydantic, we could simply check if something is 1) a dict subtype and 2) has the typical TypedDict class attributes. This is obviously less reliable, but should work well enough in most cases:

    from typing import get_type_hints
    
    
    def is_typeddict(cls: type) -> bool:
        if not (isinstance(cls, type) and issubclass(cls, dict)):
            return False
        return hasattr(cls, "__total__") and hasattr(cls, "__required_keys__")
    
    
    def get_typed_dict_keys(cls: type) -> list[str]:
        keys: list[str] = []
        for key, type_ in get_type_hints(cls).items():
            if is_typeddict(type_):
                keys.extend(
                    f"{key}_{sub_key}"
                    for sub_key in get_typed_dict_keys(type_)
                )
            else:
                keys.append(key)
        return keys
    

    Same Demo, same output.
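    To illustrate what "less reliable" means here, the following sketch runs the same heuristic against a real TypedDict and against a plain dict subclass that happens to define both attributes. Movie and Impostor are hypothetical classes made up for this example:

```python
from typing import TypedDict


def is_typeddict(cls: type) -> bool:
    # Same attribute-based heuristic as above.
    if not (isinstance(cls, type) and issubclass(cls, dict)):
        return False
    return hasattr(cls, "__total__") and hasattr(cls, "__required_keys__")


class Movie(TypedDict):
    title: str


class Impostor(dict):
    # Plain dict subclass that merely defines the two attribute names.
    __total__ = True
    __required_keys__ = frozenset()


results = {
    "Movie": is_typeddict(Movie),
    "Impostor": is_typeddict(Impostor),
    "dict": is_typeddict(dict),
    "str": is_typeddict(str),
}
print(results)
# {'Movie': True, 'Impostor': True, 'dict': False, 'str': False}
```

    The false positive for Impostor is contrived, which is why the heuristic should work well enough in most real code.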


  • Python 3.8

    Here we can simply replace the get_type_hints call in the get_typed_dict_keys function with cls.__annotations__.

    Also, the TypedDict.__required_keys__ class attribute was only added in Python 3.9, so to see if something is a TypedDict, we can only check for __total__. This is obviously even less robust, but the best we can do with Python 3.8.

    With type annotations etc. adjusted properly, the 3.8 code would look like this:

    from typing import Any, List, Type
    
    
    def is_typeddict(cls: Type[Any]) -> bool:
        if not (isinstance(cls, type) and issubclass(cls, dict)):
            return False
        return hasattr(cls, "__total__")
    
    
    def get_typed_dict_keys(cls: Type[Any]) -> List[str]:
        keys: List[str] = []
        for key, type_ in cls.__annotations__.items():
            if is_typeddict(type_):
                keys.extend(
                    f"{key}_{sub_key}"
                    for sub_key in get_typed_dict_keys(type_)
                )
            else:
                keys.append(key)
        return keys
    

    Same Demo, same output.


  • PS, just for fun

    Here is a function that can return a nested dictionary of annotations of nested TypedDicts and optionally flatten it, just because:

    from typing import Any, get_type_hints, is_typeddict
    
    
    def get_typed_dict_annotations(
        cls: type,
        recursive: bool = False,
        flatten: bool = False,
    ) -> dict[str, Any]:
        if not recursive:
            return get_type_hints(cls)
        output: dict[str, Any] = {}
        for key, type_ in get_type_hints(cls).items():
            if not is_typeddict(type_):
                output[key] = type_
                continue
            sub_dict = get_typed_dict_annotations(
                type_,
                recursive=recursive,
                flatten=flatten,
            )
            if flatten:
                for sub_key, sub_type in sub_dict.items():
                    output[f"{key}_{sub_key}"] = sub_type
            else:
                output[key] = sub_dict
        return output
    

    Demo:

    from pprint import pprint
    from typing import TypedDict
    
    
    class Foo(TypedDict):
        name: str
        color: str
    
    
    class Bar(TypedDict):
        x: Foo
        y: str
    
    
    class Baz(TypedDict):
        a: int
        b: Bar
    
    
    baz_annotations = get_typed_dict_annotations(Baz, recursive=True, flatten=True)
    pprint(baz_annotations, sort_dicts=False)
    
    print(list(baz_annotations.keys()))
    
    baz_annotations = get_typed_dict_annotations(Baz, recursive=True)
    pprint(baz_annotations, sort_dicts=False)
    

    Output:

    {'a': <class 'int'>,
     'b_x_name': <class 'str'>,
     'b_x_color': <class 'str'>,
     'b_y': <class 'str'>}
    
    ['a', 'b_x_name', 'b_x_color', 'b_y']
    
    {'a': <class 'int'>,
     'b': {'x': {'name': <class 'str'>, 'color': <class 'str'>},
           'y': <class 'str'>}}