python yaml ruamel.yaml

How to maintain original order of attributes when using ruamel yaml.dump?


I am using a pydantic model to store some data. The model itself is not relevant; the relevant part is that calling model_dump_json() on the pydantic model gives me a string like this:

str_val = '{"aspect":{"name":"tuskhelm_of_joritz_the_mighty"},"affix":[{"name":"maximum_life"},{"name":"intelligence"}]}'

Note that "aspect" is listed first. However, whenever I try to write this out to a YAML file, "affix" is always put first. This is for a human-readable, human-modifiable file, and I really like how ruamel.yaml writes out the file, but is there any way for me to ensure that "aspect" will always be written out first?

Here is some sample code:

import sys

from ruamel.yaml import YAML


def main():
    str_val = '{"aspect":{"name":"tuskhelm_of_joritz_the_mighty"},"affix":[{"name":"maximum_life"},{"name":"intelligence"}]}'
    yaml = YAML(typ="safe", pure=True)
    dict_val = yaml.load(str_val)
    print("Dictionary value: " + str(dict_val))
    yaml.dump(dict_val, sys.stdout)


if __name__ == "__main__":
    main()

And here is the output of this code:

Dictionary value: {'aspect': {'name': 'tuskhelm_of_joritz_the_mighty'}, 'affix': [{'name': 'maximum_life'}, {'name': 'intelligence'}]}
affix:
- {name: maximum_life}
- {name: intelligence}
aspect: {name: tuskhelm_of_joritz_the_mighty}

Note that even though the dictionary is in the proper order, upon dumping aspect always comes out second.

I've also tried other means of writing out the YAML (including using the default loader instead of the safe one). Although I can get the order to be preserved that way, the output is not (in my opinion) as human readable. If there's a way to configure a different YAML writer to produce output like this, I'd be fine with that too.


Solution

  • The safe loader doesn't preserve the order of the keys of a mapping, neither in pure mode nor when using the C extension. It essentially follows the YAML specification, which says that mapping keys are unordered, and the behaviour of PyYAML, from which it was forked, so the dumper actively sorts the output keys.
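To see why the keys end up swapped, the effect can be mimicked with the standard library alone. This is a minimal sketch that does not call ruamel.yaml; it only assumes the safe dumper sorts mapping keys alphabetically, as described above:

```python
import json

str_val = '{"aspect":{"name":"tuskhelm_of_joritz_the_mighty"},"affix":[{"name":"maximum_life"},{"name":"intelligence"}]}'

# A plain dict keeps insertion order, which is why the printed
# dictionary in the question looks correct.
data = json.loads(str_val)
insertion_order = list(data)

# Sorting the keys, as the safe dumper is described as doing,
# puts "affix" ahead of "aspect".
sorted_order = sorted(data)

print(insertion_order)
print(sorted_order)
```

So the loaded dictionary is fine; the reordering happens only at dump time.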

    If you leave out typ="safe" (the default round-trip loader is currently pure Python only), the order is preserved, and so is the flow style of the original. The flow-style information is attached to the loaded dictionaries/lists (actually to instances of subclasses of these). Using a different dumper instance is therefore not going to help: the default dumper recognises this flow-style information and dumps the original flow style. The safe dumper would ignore that information, if it could handle the dict/list subclasses at all, but it would also sort the keys, so it is of no use in this case.

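    The thing to do is to recursively remove the flow-style information from the collection instances, and to set the default_flow_style attribute of the YAML instance (which defaults to False) to None. With None, only leaf collections (those that contain no nested collections) are written in flow style, and everything else is written in block style: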

    import sys
    
    from ruamel.yaml import YAML
    
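    def rm_style_info(d):
        # Round-trip loading yields dict/list subclasses that carry the
        # original flow style in their .fa (format attribute) object.
        if isinstance(d, dict):
            d.fa._flow_style = None  # private attribute, no public API to unset it
            for k, v in d.items():
                rm_style_info(k)
                rm_style_info(v)
        elif isinstance(d, list):
            d.fa._flow_style = None
            for elem in d:
                rm_style_info(elem)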
    
    def main():
        str_val = '{"aspect":{"name":"tuskhelm_of_joritz_the_mighty"},"affix":[{"name":"maximum_life"},{"name":"intelligence"}]}'
        yaml = YAML()
        yaml.default_flow_style = None
        dict_val = yaml.load(str_val)
        rm_style_info(dict_val)
        print("Dictionary value: " + str(dict_val))
        yaml.dump(dict_val, sys.stdout)
    
    
    if __name__ == "__main__":
        main()
    

    which gives:

    Dictionary value: {'aspect': {'name': 'tuskhelm_of_joritz_the_mighty'}, 'affix': [{'name': 'maximum_life'}, {'name': 'intelligence'}]}
    aspect: {name: tuskhelm_of_joritz_the_mighty}
    affix:
    - {name: maximum_life}
    - {name: intelligence}
    

    There is currently no API to unset the flow style stored on the format attribute (.fa), and the code above relies on the private _flow_style attribute. Make sure to pin the version of ruamel.yaml you are using, and test before updating the version number.
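Since the helper touches a private attribute, a more defensive variant can make breakage easier to spot after an upgrade. The following is only a sketch under stated assumptions: clear_flow_style is a hypothetical name (not part of ruamel.yaml), and it assumes the round-trip collections keep exposing a .fa object with a _flow_style attribute, as in the code above. Plain dicts and lists, which have no .fa, are skipped, and a missing _flow_style raises an error instead of passing silently:

```python
def clear_flow_style(node):
    """Recursively reset the flow-style marker on round-trip collections.

    Assumption: the dict/list subclasses produced by round-trip loading
    expose a private `.fa._flow_style` attribute (not a public API).
    """
    if isinstance(node, (dict, list)):
        fa = getattr(node, "fa", None)  # plain dict/list have no .fa
        if fa is not None:
            if not hasattr(fa, "_flow_style"):
                # Fail loudly if the internals this relies on have changed.
                raise AttributeError(
                    "node.fa has no _flow_style; library internals may have changed"
                )
            fa._flow_style = None
    if isinstance(node, dict):
        for key, value in node.items():
            clear_flow_style(key)
            clear_flow_style(value)
    elif isinstance(node, list):
        for item in node:
            clear_flow_style(item)
    return node
```

Under those assumptions, calling clear_flow_style(dict_val) in place of rm_style_info(dict_val) in the example above should produce the same dump.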