python yaml

Read and dump [bracket, list] from and to YAML with Python


I am trying to read a list from a YAML file and dump it back, using the following code:

import yaml

with open(system_bsc_path) as f:
    system_bsc_dict = yaml.safe_load(f)  # yaml.load() without a Loader fails on PyYAML 6+
with open(system_bsc_path, "w") as f:
    yaml.safe_dump(system_bsc_dict, f)

The input list, as in the file:

chs_per_cath: [[[10, 11, 12, 13], [13000, 13100, 13200, 13300]],
 [[16, 17, 18, 19, 20, 21, 22, 23, 24, 25], [13400, 13500, 13600, 13700, 13800, 13900, 14000, 14100, 14200, 14300]],
 [[32, 33, 34, 35, 36, 37, 38, 39, 40, 41], [13400, 13500, 13600, 13700, 13800, 13900, 14000, 14100, 14200, 14300]]]

is read properly into Python.


The output that gets dumped:

chs_per_cath:
- - - 10
    - 11
    - 12
    - 13
  - - 13000
    - 13100
    - 13200
    - 13300
- - - 16
    - 17
    - 18
    - 19
    - 20
    - 21
    - 22
    - 23
    - 24
    - 25
  - - 13400
    - 13500
    - 13600
    - 13700
    - 13800
    - 13900
    - 14000
    - 14100
    - 14200
    - 14300
- - - 32
    - 33
    - 34
    - 35
    - 36
    - 37
    - 38
    - 39
    - 40
    - 41
  - - 13400
    - 13500
    - 13600
    - 13700
    - 13800
    - 13900
    - 14000
    - 14100
    - 14200
    - 14300

How can I get the same output as the input?


Solution

  • If you want to load, then dump (maybe after modifying some values), PyYAML is not the right tool, as it will mangle many things in the syntactic representation.

    It will drop flow style as you noticed, but also drop comments, anchor/alias names, specific integer formats (octal, hex, binary), etc.

    There is little control over flow vs. block style in PyYAML's output. The default_flow_style parameter to safe_dump() gives you only three choices: everything in block style (False), only leaf collections in flow style (None), or everything in flow style (True).
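
As a rough illustration of those three settings, here is a minimal sketch. The sample data is a shortened, made-up version of the list in the question, and the variable names are only for illustration:

```python
import yaml  # PyYAML

# Shortened sample data, loosely modelled on the list in the question.
data = {"chs_per_cath": [[[10, 11], [13000, 13100]]]}

# False: every collection is written in block style.
all_block = yaml.safe_dump(data, default_flow_style=False)

# None: only leaf collections (those holding no other collections) use flow style.
leaf_flow = yaml.safe_dump(data, default_flow_style=None)

# True: every collection, including the top-level mapping, uses flow style.
all_flow = yaml.safe_dump(data, default_flow_style=True)

for label, text in [("False", all_block), ("None", leaf_flow), ("True", all_flow)]:
    print(f"default_flow_style={label}")
    print(text)
```

None of the three reproduces a hand-written mix of styles, because PyYAML does not remember how each node was written when it loads the file.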

    You'd be better off using ruamel.yaml (disclaimer: I am the author of that library). It supports the YAML 1.2 standard, released in 2009, whereas PyYAML only handles the outdated YAML 1.1, and it will give you output that is much closer, and often identical, to your YAML input.

    from ruamel.yaml import YAML
    
    yaml = YAML()
    with open(system_bsc_path) as f:
        system_bsc_dict = yaml.load(f)
    with open(system_bsc_path, "w") as f:
        yaml.dump(system_bsc_dict, f)
    

    If you are on Python 3, you can pass a pathlib.Path directly instead of opening the file yourself:

    from pathlib import Path
    yaml_file = Path(system_bsc_path)
    system_bsc_dict = yaml.load(yaml_file)
    yaml.dump(system_bsc_dict, yaml_file)
    

    By default, any new lists (and dicts) you add will be dumped in block style. If you want to add a flow-style list, you can either set yaml.default_flow_style = True, which applies to all such new collections, or get fine-grained control by setting the flow attribute (.fa) on the special internal representation:

    def FSlist(items):  # convert a list to flow style (the default is block style)
        from ruamel.yaml.comments import CommentedSeq
        cs = CommentedSeq(items)
        cs.fa.set_flow_style()  # mark this one sequence for flow-style output
        return cs
    
    system_bsc_dict['existing_field'] = FSlist(["Boston Maestro 4000"])