
Serializing namedtuples via PyYAML


I'm looking for some reasonable way to serialize namedtuples in YAML using PyYAML.

A few things I don't want to do:

I was thinking of something along these lines:

import collections

import yaml


class namedtuple(object):
    def __new__(cls, *args, **kwargs):
        # Build the regular namedtuple class, then subclass it to add state.
        x = collections.namedtuple(*args, **kwargs)

        class New(x):
            def __getstate__(self):
                return {
                    "name": self.__class__.__name__,
                    "_fields": self._fields,
                    "values": list(self._asdict().values())
                }
        return New

def namedtuple_constructor(loader, node):
    import IPython; IPython.embed()
    value = loader.construct_scalar(node)

import re
pattern = re.compile(r'!!python/object/new:myapp.util\.')
yaml.add_implicit_resolver(u'!!myapp.util.namedtuple', pattern)
yaml.add_constructor(u'!!myapp.util.namedtuple', namedtuple_constructor)

Assuming this was in an application module at the path myapp/util.py

However, the constructor is never called when I try to load:

import yaml

from myapp.util import namedtuple

x = namedtuple('test', ['a', 'b'])
t = x(1, 2)
dump = yaml.dump(t)
# Recent PyYAML releases require an explicit Loader argument.
load = yaml.load(dump, Loader=yaml.Loader)

The load fails because PyYAML cannot find a class named New in myapp.util.
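A minimal, PyYAML-free sketch of why the lookup fails, under the assumption that PyYAML's python/object tags record only the class's module and `__name__` and resolve them with an attribute lookup on the imported module. The wrapper below is a trimmed copy of the one above; `TestTuple` is a hypothetical name used only for illustration.

```python
import collections


class namedtuple(object):
    def __new__(cls, *args, **kwargs):
        base = collections.namedtuple(*args, **kwargs)

        class New(base):
            pass

        return New


TestTuple = namedtuple("test", ["a", "b"])

# Every generated class reports the same name, regardless of the typename
# that was passed in.
class_name = TestTuple.__name__

# The class is local to __new__, so it is never bound at module level.
qualified_name = TestTuple.__qualname__
bound_at_module_level = "New" in globals()

print(class_name)
print(qualified_name)
print(bound_at_module_level)
```

Because `New` only exists inside `__new__`, a loader that imports the module and looks for an attribute called `New` has nothing to find, which matches the error described above.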

I tried a variety of other approaches as well, this was just one that I thought might work best.

Disclaimer: Even once I get into the proper constructor, I'm aware my spec will need further work regarding which arguments get saved and how they are passed into the resulting object. The first step for me is to get the YAML representation into my constructor function; the rest should then be easy.


Solution

  • I was able to solve my problem, though in a slightly less than ideal way.

    My application now uses its own namedtuple implementation. I copied the collections.namedtuple source, created a base class for all new namedtuple types to inherit from, and modified the template (excerpts below for brevity, highlighting only what changed from the namedtuple source).

    class namedtupleBase(tuple): 
        pass
    
    _class_template = '''\
    class {typename}(namedtupleBase):
        '{typename}({arg_list})'
    

    One little change to the namedtuple function itself to add the new class into the namespace:

    namespace = dict(_itemgetter=_itemgetter, __name__='namedtuple_%s' % typename,
                     OrderedDict=OrderedDict, _property=property, _tuple=tuple,
                     namedtupleBase=namedtupleBase)
    

    Now registering a multi_representer solves the problem:

    def repr_namedtuples(dumper, data):
        return dumper.represent_mapping(u"!namedtupleBase", {
            "__name__": data.__class__.__name__,
            "__dict__": collections.OrderedDict(
                [(k, v) for k, v in data._asdict().items()])
        })
    
    def construct_namedtuples(loader, node):
        value = loader.construct_mapping(node)
        cls_ = namedtuple(value['__name__'], value['__dict__'].keys())
        return cls_(*value['__dict__'].values())
    
    yaml.add_multi_representer(namedtupleBase, repr_namedtuples)
    yaml.add_constructor("!namedtupleBase", construct_namedtuples)
    

    Hat tip to the question "Represent instance of different classes with the same base class in pyyaml" for the inspiration behind the solution.

    I would love an idea that doesn't require re-creating the namedtuple function, but this accomplished my goals.
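One possible direction that avoids copying the namedtuple source, offered as a sketch rather than a tested replacement for the approach above: register a multi-representer for `tuple` on a Dumper subclass and treat any instance with a `_fields` attribute as a namedtuple. The tag `!namedtuple`, the mapping keys, and the `NamedTupleDumper` / `NamedTupleLoader` class names are all assumptions made for this example. Plain tuples should be unaffected, since PyYAML checks its exact-type representers before the multi-representers.

```python
import collections

import yaml

NAMEDTUPLE_TAG = "!namedtuple"  # hypothetical application-specific tag


class NamedTupleDumper(yaml.Dumper):
    """Dumper subclass, so the global yaml.Dumper is left untouched."""


class NamedTupleLoader(yaml.SafeLoader):
    """Loader subclass that only adds the one custom tag."""


def represent_namedtuple(dumper, data):
    if not hasattr(data, "_fields"):
        # Some other tuple subclass: use PyYAML's generic object representation.
        return dumper.represent_object(data)
    return dumper.represent_mapping(NAMEDTUPLE_TAG, {
        "name": type(data).__name__,
        "fields": list(data._fields),
        "values": list(data),
    })


def construct_namedtuple(loader, node):
    # deep=True so the nested lists are fully built before they are used here.
    state = loader.construct_mapping(node, deep=True)
    cls_ = collections.namedtuple(state["name"], state["fields"])
    return cls_(*state["values"])


NamedTupleDumper.add_multi_representer(tuple, represent_namedtuple)
NamedTupleLoader.add_constructor(NAMEDTUPLE_TAG, construct_namedtuple)

# Round trip using the names from the question.
x = collections.namedtuple("test", ["a", "b"])
text = yaml.dump(x(1, 2), Dumper=NamedTupleDumper)
restored = yaml.load(text, Loader=NamedTupleLoader)
print(text)
print(restored)
```

Note that each load builds a fresh class, so `type(restored)` is not the original `x`, although the two instances still compare equal as tuples. Caching classes by name and fields would be one way to address that if identity matters.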