I am using ruamel.yaml version 0.18.5 and Python 3.10.4.
My problem centres around adding (and removing) comments from yaml files. In particular I want to ensure that the end of every top-level block of data has a newline after it, and the same for every block that conforms to the same schema.
Consider for example the following:
import ruamel.yaml
import sys
from pathlib import Path
from ruamel.yaml import YAML
yaml = YAML()
yaml.preserve_quotes = True
yaml.encoding = True
yaml.indent(mapping=2, sequence=4, offset=2)
yaml.width = 120
yaml_test_data = """
name: My Application
description: >
  A multiline description of what my app does
bcp:
  tier: 2
  rto: PT1H
  rpo: PT1H

environment:
  - name: EMEA
    host: [abc*]
  - name: APAC
    host: [def*]
data:
  class: [INT, PUB]
  categories: [REF, MKT]
"""
testdata = yaml.load(yaml_test_data)
Now, I want to do the following:
- remove the newline after the "rpo" item
- add another item to the bcp block
- add a comment after this new item
- insert a comment after the first item in the environment block
To test what's possible I've tried to use the yaml_set_comment_before_after_key() function. As the author freely admits, the documentation isn't complete so I'm not sure if this is the right approach.
testdata['bcp'].yaml_set_comment_before_after_key(key='rpo', after="")
testdata['bcp']['new_item'] = "new data"
testdata['bcp'].yaml_set_comment_before_after_key(
    key='new_item',
    before="This is a comment before 'new_item' key",
    after="This is a comment after 'new_item' key",
)
testdata.yaml_set_comment_before_after_key(
    key='name',
    before="comment before key 'name'",
    after="comment after key 'name'",
)
testdata['environment'][0].yaml_set_comment_before_after_key(
    key='host',
    after="comment after 1st host",
)
When I inspect the comment information associated with 'bcp' via testdata['bcp'].ca, I now get this:
Comment(
start=None,
items={
rpo: [None, None, CommentToken('\n\n', line: 8, col: 7), None]
new_item: [None, [CommentToken("# This is a comment before 'new_item' key\n", col: 0)], None, [CommentToken("# This is a comment after 'new_item' key\n", col: 2)]]
})
The comment has not been removed from the 'rpo' item, but it looks like the before and after comments I added to the 'new_item' are in place.
However, when I dump it to the screen via yaml.dump(testdata, sys.stdout), the 'after' comments are missing from both the 'bcp' block and the 'name' field, and the value of the first 'host' field is split from its key:
# comment before key 'name'
name: My Application
description: >
  A multiline description of what my app does
bcp:
  tier: 2
  rto: PT1H
  rpo: PT1H
# This is a comment before 'new_item' key
  new_item: new data
environment:
  - name: EMEA
    host:
# comment after 1st host
[abc*]
  - name: APAC
    host: [def*]
data:
  class: [INT, PUB]
  categories: [REF, MKT]
This is a simplified example of the hundreds of YAML files I want to update. I want to enforce better formatting by checking and removing comments where necessary, and by adding a newline after each (top-level) block of data.
What do I need to do to get the code working?
I first repeat here two things I have written in other answers:
ruamel.yaml that you are using.
You don't indicate you did step 2, so let's try that first (I hope I got the expected output correct).
which gives:
And since that looks like the input, it should be possible to construct that from the data loaded from your input.
You indicate you analysed both the input and the data updated with your changes, but you probably didn't look at the data loaded from the expected output, so let's do that. Keep in mind that accessing the .ca attribute on a loaded collection node (sequence, mapping) will create an (empty) comment if there previously was no comment:
which gives:
Let's start with the last one (3:). As you can see, there is no comment string there. The reason for this is that ruamel.yaml tends to gather a comment and attach it to the next node it fully parses; in this case that is the second item (index 1) of the sequence that is the value for key environment:
which gives:
So you should attach your comment to data['environment'], but you can probably also attach it to the list that is the value for host. Your code attaches the comment between the key host and its value because the value is not a scalar.
You cannot assign an empty comment, as that will give you an empty line. Instead, delete the comment entry altogether.
Starting with your input:
which gives:
I suggest you make some wrapper routines, so that when the internals change, you "only" have to update those.
Other things to consider:
- Don't set yaml.encoding to True; it defaults to utf-8 and you should probably leave it at that.
- The line import ruamel.yaml is superfluous, since you only use YAML, which you get from from ruamel.yaml import YAML.