python metaclass

Python metaclass keyword arguments not getting used by subclass


I am trying to write a metaclass to assist in serialization. The intention of the metaclass was to isolate the production code from the serialization mode (e.g. YAML or JSON) as much as possible. So, things could inherit from the class Serializable, and not have to worry (too much) about whether it was going to be serialized as YAML or JSON.

I have the following code (for a YAML serializer). It basically works, but I wanted to specify the Loader and Dumper as keyword arguments (since PyYAML could use SafeLoader, FullLoader, etc.). This is where I have the problem. I added keyword arguments for Loader and Dumper to the Serializable class. These work for that class, but not for subclasses: e.g. when I define the class A (which is a subclass of Serializable), the metaclass's __new__ does not get any keyword arguments.

What am I doing wrong?

import yaml

class SerializableMeta(type):
    @classmethod
    def __prepare__(cls, name, bases, **kwargs):
        return {"yaml_tag": None, "to_yaml": None, "from_yaml": None}

    def __new__(cls, name, bases, namespace, **kwargs):
        yaml_tag = f"!{name}"
        cls_ = super().__new__(cls, name, bases, namespace)
        cls_.yaml_tag = yaml_tag
        cls_.to_yaml = lambda dumper, obj: dumper.represent_mapping(yaml_tag, obj.__dict__)
        cls_.from_yaml = lambda loader, node: cls_(**loader.construct_mapping(node))
        kwargs["Dumper"].add_representer(cls_, cls_.to_yaml)
        kwargs["Loader"].add_constructor(yaml_tag, cls_.from_yaml)
        return cls_


class Serializable(metaclass=SerializableMeta, Dumper=yaml.Dumper, Loader=yaml.Loader):
    pass


class A(Serializable):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __repr__(self):
        return f"A({self.a}, {self.b})"

    def __eq__(self, other):
        if isinstance(other, A):
            return self.a == other.a and self.b == other.b
        else:
            return NotImplemented

Solution

  • That is it - the keyword mechanism for class creation does not imply that subclasses will use the same keywords. Those keywords (which are forwarded to the metaclass's __new__ and __init__, and to the superclasses' __init_subclass__ special method) need to be declared by each class as it is created - either in the class statement, or by calling the metaclass directly.
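A minimal sketch of that behaviour, using only the standard library. The names Meta, Base, Child, Explicit, seen_kwargs and the flavor keyword are made up for illustration and are not part of the question's code:

```python
class Meta(type):
    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace)
        # Record exactly what this particular class statement passed in.
        cls.seen_kwargs = dict(kwargs)
        return cls


class Base(metaclass=Meta, flavor="safe"):
    pass


class Child(Base):  # no keywords in this statement, so __new__ gets none
    pass


class Explicit(Base, flavor="full"):  # keywords must be repeated to be seen
    pass


print(Base.seen_kwargs)      # {'flavor': 'safe'}
print(Child.seen_kwargs)     # {}
print(Explicit.seen_kwargs)  # {'flavor': 'full'}
```

Child still uses Meta as its metaclass (that part is inherited), but the keywords given to Base are not replayed for it.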

    A workaround, however, is easy - after all, the metaclass __new__ method is "king" when it comes to class creation. In this case, the most straightforward path seems to be having the metaclass record the keyword arguments used, if any, as attributes on the newly created class - so that when a subclass is created without them, the values are found along the __mro__ by normal attribute lookup.

    import yaml
    
    class SerializableMeta(type):
        @classmethod
        def __prepare__(mcls, name, bases, **kwargs):
            return {"yaml_tag": None, "to_yaml": None, "from_yaml": None}
    
        def __new__(mcls, name, bases, namespace, **kwargs):
            yaml_tag = f"!{name}"
            cls_ = super().__new__(mcls, name, bases, namespace)
            cls_.yaml_tag = yaml_tag
            cls_.to_yaml = lambda dumper, obj: dumper.represent_mapping(yaml_tag, obj.__dict__)
            cls_.from_yaml = lambda loader, node: cls_(**loader.construct_mapping(node))
            # Ensure the relevant kwargs, if passed, are stored on the class.
            for attr in ("Dumper", "Loader"):
                if attr in kwargs:
                    setattr(cls_, f"_yaml_{attr}", kwargs[attr])
            # Subclasses can then retrieve those kwargs by normal attribute lookup
            # if none were passed to them directly:
            cls_._yaml_Dumper.add_representer(cls_, cls_.to_yaml)
            cls_._yaml_Loader.add_constructor(yaml_tag, cls_.from_yaml)
            return cls_
        
        ...
    
    
    

    If for some reason you don't want those to be exposed in the final subclass object (although they are marked as private by the _ prefix), they could be stored in a dictionary created in the metaclass itself that would work as a "registry" - but then it is a bit more code, re-implementing some of the attribute lookup logic. I don't think that is needed.
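For completeness, here is a rough sketch of what such a registry could look like. It is only one possible shape, not something the code above requires: RegistryMeta, _options and lookup are hypothetical names, and FakeDumper, FakeLoader and OtherLoader are stand-ins for the PyYAML classes so the snippet runs without PyYAML installed.

```python
class FakeDumper:
    """Hypothetical stand-in for a PyYAML Dumper class."""


class FakeLoader:
    """Hypothetical stand-in for a PyYAML Loader class."""


class OtherLoader:
    """Hypothetical stand-in for an alternative Loader class."""


class RegistryMeta(type):
    # Maps each created class to the Dumper/Loader keywords it was given.
    _options = {}

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls._options[cls] = {
            key: value for key, value in kwargs.items()
            if key in ("Dumper", "Loader")
        }
        return cls

    def lookup(cls, key):
        # Inheritance re-implemented by hand: walk the MRO, first hit wins.
        for klass in cls.__mro__:
            options = RegistryMeta._options.get(klass, {})
            if key in options:
                return options[key]
        raise LookupError(f"{cls.__name__} has no {key!r} configured")


class Base(metaclass=RegistryMeta, Dumper=FakeDumper, Loader=FakeLoader):
    pass


class Child(Base):  # inherits both settings through the registry
    pass


class Special(Child, Loader=OtherLoader):  # overrides only the Loader
    pass


print(Child.lookup("Dumper").__name__)    # FakeDumper
print(Special.lookup("Loader").__name__)  # OtherLoader
print(Special.lookup("Dumper").__name__)  # FakeDumper
```

Nothing is set on the classes themselves here, at the cost of the explicit MRO walk in lookup. A plain dict keyed by class also keeps every class alive for as long as the metaclass exists, which is one more reason the attribute approach is simpler.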

    As a side note, keep in mind that "class methods" in the metaclass itself get the metaclass as their first argument - therefore it can be named "mcls" (or "metaclass" or "mclass" - there is no convention there) to make it distinct from the class being created, which will be "cls". I kept your "cls_" name in the code above, but made the change to "mcls" to make clear what is the metaclass itself. In the same way, what in an ordinary class would be the self argument can be written as cls in a metaclass (for example, if you are overriding the metaclass's __call__ method).
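To illustrate the naming, here is a small sketch unrelated to YAML. CountingMeta, Point and instances_created are invented for the example; the point is only which object each first argument refers to.

```python
class CountingMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # mcls: the metaclass itself (CountingMeta or a subclass of it).
        return {}

    def __new__(mcls, name, bases, namespace, **kwargs):
        # mcls: again the metaclass; the class is what gets built here.
        cls = super().__new__(mcls, name, bases, namespace)
        cls.instances_created = 0
        return cls

    def __call__(cls, *args, **kwargs):
        # An ordinary method of the metaclass: what would be "self" is the
        # class being instantiated, so naming it cls reads naturally.
        cls.instances_created += 1
        return super().__call__(*args, **kwargs)


class Point(metaclass=CountingMeta):
    def __init__(self, x, y):
        self.x = x
        self.y = y


Point(1, 2)
Point(3, 4)
print(Point.instances_created)  # 2
```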