Parsing YAML containing multiple documents

I have a YAML file (Unity YAML) which contains multiple documents:

%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!1 &500710038956159860
GameObject:
  m_ObjectHideFlags: 0
  m_CorrespondingSourceObject: {fileID: 0}
--- !u!4 &7386865245216119872
Transform:
  m_ObjectHideFlags: 0
  m_CorrespondingSourceObject: {fileID: 0}

Parsing it with yaml-cpp seems incomplete:

YAML::Node map = YAML::Load(loaded_map);
fmt::print("{}\n", YAML::Dump(map));

prints only the first document (and misses its tag):

!<tag:unity3d.com,2011:1>
GameObject:
  m_ObjectHideFlags: 0
  m_CorrespondingSourceObject: {fileID: 0}

Is it possible to get the other documents and their tags (see the missing &500710038956159860)? Or is there another library capable of doing this?

1 answer

Answered by flyx

The given input is invalid YAML: a %TAG directive defines a named handle only for the document that follows it, so !u! is an unknown handle in the second document. No conforming YAML implementation should accept this. From the spec:

Each document is completely independent from the rest.

Directives belong with the following document.
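
One possible workaround, as a minimal sketch only: rewrite the stream before parsing so that every document after the first restates the leading directives. The name repeat_directives is made up for illustration. The sketch assumes that all directives sit at the top of the file (as in the sample above) and that no content line starts with --- in column 0. Each later document is preceded by a ... end marker and a copy of the directives, since the spec appears to require a document end marker before a directive line.

```cpp
#include <sstream>
#include <string>

// True if `line` is a document start marker: "---" alone or followed by whitespace.
bool is_document_start(const std::string& line) {
  if (line.compare(0, 3, "---") != 0) return false;
  return line.size() == 3 || line[3] == ' ' || line[3] == '\t' || line[3] == '\r';
}

// Hypothetical helper: copies the directives found before the first document
// in front of every later document, so each document carries its own %TAG.
// Assumption: directives appear only at the top of the stream.
std::string repeat_directives(const std::string& text) {
  std::istringstream lines(text);
  std::string line, directives, out;
  bool seen_first_document = false;
  while (std::getline(lines, line)) {
    if (!seen_first_document && !line.empty() && line[0] == '%') {
      directives += line + "\n";  // remember the leading directives
      out += line + "\n";
      continue;
    }
    if (is_document_start(line)) {
      if (seen_first_document) {
        // End the previous document, then restate the directives.
        out += "...\n" + directives;
      }
      seen_first_document = true;
    }
    out += line + "\n";
  }
  return out;
}
```

The result can then be handed to the parser in place of the original text. Whether a given parser needs this step depends on how strictly it follows the spec.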

Concerning yaml-cpp, there is

auto documents = YAML::LoadAll(loaded_map);

which gives you all documents, but there's no corresponding YAML::DumpAll.
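
Since there is no DumpAll, the documents have to be written out one at a time. The following is a sketch of such a loop, not part of yaml-cpp: dump_all is a hypothetical helper, written generically so that it does not depend on a particular node type. It starts every document with a --- line and serializes it with whatever function is passed in.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: serializes every document with `dump` and joins the
// results into one stream, starting each document with a "---" marker line.
template <typename Doc, typename DumpFn>
std::string dump_all(const std::vector<Doc>& documents, DumpFn dump) {
  std::string out;
  for (const Doc& doc : documents) {
    out += "---\n";
    out += dump(doc);
    out += "\n";
  }
  return out;
}

// Possible use with yaml-cpp (assumes <yaml-cpp/yaml.h> is included):
//   auto text = dump_all(YAML::LoadAll(loaded_map),
//                        [](const YAML::Node& n) { return YAML::Dump(n); });
```

Output produced this way still has the limitations described below: tags come out in the verbatim form shown in the question, and the original %TAG directive and anchors are not reproduced.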

Very few libraries support writing %TAG directives; in fact, the only one I know of that can do this is my own, NimYAML.

Very few libraries support preserving the anchors in the input; the only one I know of is go-yaml. Usually, implementations do not emit anchors unless they are referenced, and they autogenerate anchor names if they need anchors. Anchors are a presentation detail, after all, and are specified not to convey content information.
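
If the anchor and the short tag are needed exactly as written, one option is to read them from the raw text instead of from the parser. The following is only a sketch: DocHeader and collect_headers are hypothetical names, and it assumes Unity-style header lines of the form --- !u!1 &500710038956159860, one per document.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Tag and anchor as written on a document start line such as
//   --- !u!1 &500710038956159860
struct DocHeader {
  std::string tag;     // e.g. "!u!1"; empty if the line has no tag
  std::string anchor;  // e.g. "500710038956159860" (without '&'); empty if absent
};

// True if `line` begins with "---" alone or followed by whitespace.
bool starts_document(const std::string& line) {
  if (line.compare(0, 3, "---") != 0) return false;
  return line.size() == 3 || line[3] == ' ' || line[3] == '\t' || line[3] == '\r';
}

// Hypothetical helper: returns one DocHeader per document start line, in
// stream order. It inspects only the header line, not nodes inside a document.
std::vector<DocHeader> collect_headers(const std::string& text) {
  std::vector<DocHeader> headers;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    if (!starts_document(line)) continue;
    DocHeader header;
    std::istringstream tokens(line.substr(3));
    std::string token;
    while (tokens >> token) {
      if (token[0] == '!') {
        header.tag = token;
      } else if (token[0] == '&') {
        header.anchor = token.substr(1);
      }
    }
    headers.push_back(header);
  }
  return headers;
}
```

The list has one entry per document, so it could be paired by index with the documents that the YAML library returns, provided both see the same number of documents.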

Generally, YAML is not a format designed for round-tripping, and no implementation is capable of completely reproducing the original style, because the spec clearly states in lots of places that several syntax features are presentation details that should be of no importance to the caller.