python ruamel.yaml

How can I add a blank line between list items with Ruamel?


I am appending items to a list:

from pathlib import Path
from ruamel.yaml import YAML

yaml = YAML()
yaml.preserve_quotes = True
yaml.allow_duplicate_keys = True
formatted_yaml = yaml.load(Path(my_file).open().read())

for item in my_items:
    # Just want a blank line here
    formatted_yaml.append(None)

    # Before adding this item
    formatted_yaml.append(item)

This produces:

- blah: 'sdfsdf'
  sdfsd: ''
- 
- blah: 'sdfsdf'
  sdfsd: ''
- 
- blah: 'sdfsdf'
  sdfsd: ''
- 
etc..

The output I want though is this:

- blah: 'sdfsdf'
  sdfsd: ''

- blah: 'sdfsdf'
  sdfsd: ''

- blah: 'sdfsdf'
  sdfsd: ''

etc..

Is it possible to just add an empty line like this?

I also tried adding a comment token like this:

ct = tokens.CommentToken("\n", error.CommentMark(0), None)
formatted_yaml.append(ct)

This failed with: ruamel.yaml.representer.RepresenterError: cannot represent an object: CommentToken('\n', col: 0)


Solution

  • If you add the element None to a list and then dump that list, you get an additional sequence item. If that sequence is block style, that item is represented as

    -
    

    unless the null element is set to be represented in a different way in YAML (e.g. as NULL, null, or ~). In general you cannot add objects to a Python list and expect that to change the YAML output without adding elements to the sequence. Appending a CommentToken just adds a sequence element, and the representer would have to be told beforehand how to represent a CommentToken (hence the RepresenterError).

    As I have indicated so often here on StackOverflow, if you want to get some specific output, the FIRST STEP is to try and round-trip that expected output:

    import sys
    from pathlib import Path
    import ruamel.yaml
    
    yaml_str = """
    - blah: 'sdfsdf'
      sdfsd: ''
    
    - blah: 'sdfsdf'
      sdfsd: ''
    
    - blah: 'sdfsdf'
      sdfsd: ''
    """
    
    yaml = ruamel.yaml.YAML()
    yaml.preserve_quotes = True
    data = yaml.load(yaml_str)
    yaml.dump(data, sys.stdout)
    

    which gives:

    - blah: 'sdfsdf'
      sdfsd: ''
    
    - blah: 'sdfsdf'
      sdfsd: ''
    
    - blah: 'sdfsdf'
      sdfsd: ''
    

    So the empty lines are preserved. As a second step, inspect data using type() and dir(), and possibly by looking at the source code (or at other questions on SO):

    print(data[0].ca)
    print(type(data[0].ca.items['sdfsd'][2].start_mark))
    

    showing:

    Comment(comment=None,
      items={'sdfsd': [None, None, CommentToken('\n\n', line: 3, col: 0), None]})
    <class 'ruamel.yaml.error.StringMark'>
    

    Thus the mapping that is the first element of the root-level sequence gets loaded and assigned a Comment that has \n\n as a CommentToken associated with its last key. If you had an end-of-line comment after the sdfsd: '' line, it would show up before the first \n, and if the empty line had a comment, that would show up before the second \n.

    Adding a newline before an additional item can therefore be accomplished by adding a Comment to the last key of the preceding element:

    last_element = data[-1]
    last_key_of_last_element = list(last_element.keys())[-1]
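    # Alternative (left commented out): build the comment list by hand
    # instead of reusing the one already loaded for the first element.
    # c = last_element.ca
    # c.items[last_key_of_last_element] = [None, None, ruamel.yaml.tokens.CommentToken('\n\n', column=1), None]
    # print(last_element.ca)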
    last_element.ca.items[last_key_of_last_element] = data[0].ca.items['sdfsd']
    data.append('and one more')
    yaml.dump(data, sys.stdout)
    

    showing:

    - blah: 'sdfsdf'
      sdfsd: ''
    
    - blah: 'sdfsdf'
      sdfsd: ''
    
    - blah: 'sdfsdf'
      sdfsd: ''
    
    - and one more
    

    It should be clear that all of this relies on ruamel.yaml internals that can change without notice. So pin the version that you are using.
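The last step can be wrapped in a small helper so that it can be called from a loop like the one in the question. This is only a sketch: the name append_with_blank_line and its parameters are made up for illustration, and it assumes that every element already in the sequence is a mapping with a .ca attribute (as produced by the round-trip loader) whose last value is a scalar.

```python
def append_with_blank_line(seq, item, blank_comment):
    """Append item to seq so that an empty line precedes it when dumped.

    blank_comment is a comment list taken from an element that already has a
    trailing empty line, e.g. data[0].ca.items['sdfsd'] in the example above
    (assumption: such an element exists in the loaded data).
    """
    if len(seq) > 0:
        last_element = seq[-1]
        # Attach the '\n\n' comment to the last key of the preceding mapping,
        # mirroring what the round-trip loader does for existing empty lines.
        last_key = list(last_element.keys())[-1]
        last_element.ca.items[last_key] = blank_comment
    seq.append(item)
```

With the data loaded above, taking blank = data[0].ca.items['sdfsd'] and then calling append_with_blank_line(data, ruamel.yaml.comments.CommentedMap(item), blank) for each item should give the layout asked for. Wrapping each new dict in CommentedMap matters here, because a plain dict has no .ca attribute for the next call to use.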

    Apart from that, you should replace formatted_yaml = yaml.load(Path(my_file).open().read()) with formatted_yaml = yaml.load(Path(my_file)). That will open the file in binary mode, letting the library deal with UTF-8 (and other Unicode encodings).
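If depending on ruamel.yaml internals is a concern, a different technique (not the one used in this answer) is to post-process the dumped text. The sketch below assumes that the root of the document is a block sequence dumped with the default indentation, so that every top-level item starts with "- " in column 0. The function name blank_line_between_items is hypothetical.

```python
def blank_line_between_items(text: str) -> str:
    """Insert one empty line before every top-level sequence item but the first."""
    out = []
    for line in text.splitlines(keepends=True):
        # A top-level item starts with a dash in column 0 (assumption: the
        # root node is a block sequence with no dash offset).
        is_item = line.startswith("- ") or line.rstrip("\n") == "-"
        # Add a separator only if something precedes the item and the
        # previous line is not already empty.
        if is_item and out and out[-1].strip():
            out.append("\n")
        out.append(line)
    return "".join(out)
```

The dump() method of the YAML instance accepts a transform callable that receives the dumped text, so yaml.dump(data, sys.stdout, transform=blank_line_between_items) should apply this in one step. Unlike the comment-based approach, this puts an empty line before every top-level item, not only before the newly appended ones.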