I'm looking for some reasonable way to serialize namedtuples in YAML using PyYAML.
A few things I don't want to do:
Rely on a dynamic call to add a constructor/representer/resolver upon instantiation of the namedtuple. These YAML files may be stored and re-loaded later, so I cannot rely on the same runtime environment existing when they are restored.
Register the namedtuples globally.
Rely on the namedtuples having unique names.
I was thinking of something along these lines:
import collections

class namedtuple(object):
    def __new__(cls, *args, **kwargs):
        x = collections.namedtuple(*args, **kwargs)
        class New(x):
            def __getstate__(self):
                return {
                    "name": self.__class__.__name__,
                    "_fields": self._fields,
                    "values": list(self._asdict().values())
                }
        return New
def namedtuple_constructor(loader, node):
    import IPython; IPython.embed()  # debugging breakpoint; never reached
    value = loader.construct_scalar(node)

import re
import yaml
pattern = re.compile(r'!!python/object/new:myapp.util\.')
yaml.add_implicit_resolver(u'!!myapp.util.namedtuple', pattern)
yaml.add_constructor(u'!!myapp.util.namedtuple', namedtuple_constructor)
Assume this lives in an application module at the path myapp/util.py.
However, the constructor is never reached when I try to load:
import yaml
from myapp.util import namedtuple

x = namedtuple('test', ['a', 'b'])
t = x(1, 2)
dump = yaml.dump(t)
load = yaml.load(dump)
It will fail to find New in myapp.util.
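As far as I can tell, the lookup fails because the dumped document carries an explicit tag built from the class's module and name, and New only exists as a local inside __new__, so nothing called New is ever bound in myapp.util. The explicit tag would also explain why my resolver never fires, since implicit resolvers are only consulted for plain scalars that have no tag. A minimal stdlib-only sketch of the lookup problem, where make_type is a hypothetical stand-in for the wrapper above:

```python
import collections
import sys

def make_type(*args, **kwargs):
    # Hypothetical stand-in for the namedtuple wrapper: the subclass is
    # created inside a function, so it is never bound at module level.
    base = collections.namedtuple(*args, **kwargs)

    class New(base):
        pass

    return New

x = make_type("test", ["a", "b"])

# A module-qualified tag is resolved by importing the module and looking the
# class name up as an attribute, which is the step that cannot succeed here.
module = sys.modules.get(x.__module__)
found = getattr(module, x.__name__, None)

print(x.__name__)      # the class name that ends up in the tag
print(found is None)   # whether the module lacks an attribute of that name
```

Every type produced this way reports the same class name, so even binding New at module level would not let two different namedtuples be told apart.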
I tried a variety of other approaches as well; this was just the one that I thought might work best.
Disclaimer: Even once I get into the proper constructor, I'm aware my spec will need further work regarding which arguments get saved and how they are passed into the resulting object. The first step for me is to get the YAML representation into my constructor function; the rest should then be easy.
I was able to solve my problem, though in a slightly less than ideal way.
My application now uses its own namedtuple implementation; I copied the collections.namedtuple source, created a base class for all new namedtuple types to inherit from, and modified the template (excerpts below for brevity, simply highlighting what changed from the namedtuple source).
class namedtupleBase(tuple):
    pass

_class_template = '''\
class {typename}(namedtupleBase):
    '{typename}({arg_list})'
One small change to the namedtuple function itself adds the new base class to the namespace:
namespace = dict(_itemgetter=_itemgetter, __name__='namedtuple_%s' % typename,
                 OrderedDict=OrderedDict, _property=property, _tuple=tuple,
                 namedtupleBase=namedtupleBase)
Now registering a multi_representer solves the problem:
def repr_namedtuples(dumper, data):
    return dumper.represent_mapping(u"!namedtupleBase", {
        "__name__": data.__class__.__name__,
        "__dict__": collections.OrderedDict(
            [(k, v) for k, v in data._asdict().items()])
    })
def construct_namedtuples(loader, node):
    value = loader.construct_mapping(node)
    cls_ = namedtuple(value['__name__'], value['__dict__'].keys())
    return cls_(*value['__dict__'].values())

yaml.add_multi_representer(namedtupleBase, repr_namedtuples)
yaml.add_constructor("!namedtupleBase", construct_namedtuples)
Hat tip to "Represent instance of different classes with the same base class in pyyaml" for the inspiration behind the solution.
I would love an approach that doesn't require re-creating the namedtuple function, but this accomplished my goals.
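One possible direction that avoids copying the namedtuple source, offered as an untested-in-production sketch rather than a definitive answer: register the multi-representer for tuple itself and treat anything with a _fields attribute as a namedtuple. This assumes duck-typing on _fields is acceptable in the application; the tag !namedtuple and the names NamedTupleDumper, NamedTupleLoader, represent_namedtuple and construct_namedtuple are all hypothetical. Registering on dumper and loader subclasses keeps the handlers out of PyYAML's global registries.

```python
import collections
import yaml

TAG = "!namedtuple"  # hypothetical tag name, not a PyYAML built-in

class NamedTupleDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass that emits namedtuples under TAG."""

class NamedTupleLoader(yaml.SafeLoader):
    """Hypothetical loader subclass that rebuilds namedtuples from TAG."""

def represent_namedtuple(dumper, data):
    # A multi-representer for `tuple` is tried for tuple subclasses that
    # have no exact-type representer, so check for the namedtuple protocol
    # and fall back to a plain sequence for anything else.
    if not hasattr(data, "_fields"):
        return dumper.represent_list(data)
    return dumper.represent_mapping(TAG, {
        "name": type(data).__name__,
        "fields": list(data._fields),
        "values": list(data),
    })

def construct_namedtuple(loader, node):
    # deep=True so the nested lists are fully built before they are used.
    spec = loader.construct_mapping(node, deep=True)
    cls = collections.namedtuple(spec["name"], spec["fields"])
    return cls(*spec["values"])

NamedTupleDumper.add_multi_representer(tuple, represent_namedtuple)
NamedTupleLoader.add_constructor(TAG, construct_namedtuple)

Point = collections.namedtuple("Point", ["x", "y"])
text = yaml.dump(Point(1, 2), Dumper=NamedTupleDumper)
restored = yaml.load(text, Loader=NamedTupleLoader)
print(restored)
```

As with the solution above, the loader builds a fresh class on every load, so restored compares equal to Point(1, 2) as a tuple but its type is not the original Point class.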