Preserve line wrapping in ruamel.yaml


Is there a way to preserve line wrapping with ruamel.yaml? The closest I have found is to use the width property (see the related question "Prevent long lines getting wrapped in ruamel.yaml"), but that doesn't quite do it.

My use case: I have a YAML file in which some long lines are left unwrapped and others are already wrapped. To minimize changes when doing a round-trip update, I'd like the untouched lines to keep their wrapping.

import ruamel.yaml
import sys

yaml = ruamel.yaml.YAML()
# yaml.width = 999

instr = """\
description: This is a long line that has been
  wrapped.
url: https://long-url.com/add-more-characters-so-that-it-goes-out-farther-than-the-default-80-cols
value: 7
"""

test = yaml.load(instr)

# Modify <value>
test['value'] += 1

yaml.dump(test, sys.stdout)

Output below. The description is unwrapped and url is moved to a new line.

description: This is a long line that has been wrapped.
url: 
  https://long-url.com/add-more-characters-so-that-it-goes-out-farther-than-the-default-80-cols
value: 8

If I uncomment the yaml.width = 999 line the url looks the way I want, but description changes.

description: This is a long line that has been wrapped.
url: https://long-url.com/add-more-characters-so-that-it-goes-out-farther-than-the-default-80-cols
value: 8

What I really want is for the description and url lines to match the original, with only value changing:

description: This is a long line that has been
  wrapped.
url: https://long-url.com/add-more-characters-so-that-it-goes-out-farther-than-the-default-80-cols
value: 8

There are 2 answers

Answer by bunji:

I think the issue here is that, per the YAML specification, line breaks in flow scalars (i.e. plain and quoted scalars, as opposed to block scalars) are always folded rather than preserved.

From the specification (see Examples 2.17 and 2.18):

All flow scalars can span multiple lines; line breaks are always folded.

Because line breaks are always folded in flow scalars, ruamel.yaml discards the newline in the description value when the YAML string is loaded. By the time the result is dumped, that information is already lost.

If you want the newlines to be preserved in the output, you need to use the block scalar format (see section 2.3 in the specification).

If your initial yaml document looks like this instead (with block scalars):

description: |
  This is a long line that has been
  wrapped.
url: https://long-url.com/add-more-characters-so-that-it-goes-out-farther-than-the-default-80-cols
value: 7

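Then the library will see the line break when the document is loaded, and so will preserve it when it is dumped later on. Note that with | the line break (and a trailing newline) becomes part of the string value itself, so the loaded value differs from the original.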

EDIT

As @Anthon points out in their comment/answer, using the >- syntax gets you a string that is closest to the one in your original document: > folds the line break into a space on load, and - strips the trailing newline. See the specification links I've provided for more information on the difference between | and >-.

Answer by Anthon:

ruamel.yaml doesn't try to preserve "soft" newlines in plain or quoted scalars. It currently only tries to do so in block style folded scalars.

For the value of key url, you need to set the line width to a large enough number; otherwise the value will not fit on the line and will be pushed to the next line (where it also doesn't fit, but with 2 positions less sticking out, and yes, the algorithm could be smarter). With a large width, however, the wrapped plain scalar for description is written out on one line, so there is no way to get what you want with your input.

What should be preserved on round-trip, while giving the same value for key description, is a folded block scalar:

description: >-
  This is a long line that has been
  wrapped.
url: https://long-url.com/add-more-characters-so-that-it-goes-out-farther-than-the-default-80-cols
value: 7