Rapidyaml

Latest version: v0.8.0


0.8.0

Breaking changes

- [BREAKING] Fix [480](https://github.com/biojppm/rapidyaml/issues/480) ([PR#489](https://github.com/biojppm/rapidyaml/pull/489)):
  - Deserializing an empty quoted string *will not* cause an error.
  - Deserializing an empty string *will* cause an error: the empty string is read in as an empty scalar.
  - Ensure keys are deserialized using all the rules applying to vals.
  - Added `KEYNIL` and `VALNIL` to `NodeType_e`, used by the parser to mark the key or val as empty. This changed the values of the `NodeType_e` enumeration.
  - Added `NodeType::key_is_null()` and `NodeType::val_is_null()`.
- [BREAKING] Fix [477](https://github.com/biojppm/rapidyaml/issues/477) ([PR#479](https://github.com/biojppm/rapidyaml/pull/479)): changed `read<std::map>()` to overwrite existing entries. The provided implementations had an inconsistency between `std::map` (which wasn't overwriting) and `std::vector` (which *was* overwriting).
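
The difference between the two behaviors described for `read<std::map>()` can be sketched with plain standard containers. This is only an illustration: `KeyVals`, `read_overwriting()` and `read_keeping()` are hypothetical names, not rapidyaml functions, and the vector of pairs merely stands in for the children of a parsed map node.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the key/val children of a parsed YAML map node.
using KeyVals = std::vector<std::pair<std::string, int>>;

// Overwriting read: entries already present in *out are replaced,
// as described for read<std::map>() from 0.8.0 onwards.
inline void read_overwriting(KeyVals const& src, std::map<std::string, int> *out)
{
    for(auto const& kv : src)
        (*out)[kv.first] = kv.second;
}

// Non-overwriting read: emplace() leaves existing keys untouched,
// mirroring the behavior described for earlier versions.
inline void read_keeping(KeyVals const& src, std::map<std::string, int> *out)
{
    for(auto const& kv : src)
        out->emplace(kv.first, kv.second);
}
```

Client code that pre-populated a map with defaults and relied on those defaults surviving a read would observe the change.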


Fixes

- [PR488](https://github.com/biojppm/rapidyaml/pull/488):
  - add workarounds for problems with codegen of gcc 11,12,13.
  - improve CI coverage of gcc and clang optimization levels.
- [PR496](https://github.com/biojppm/rapidyaml/pull/496) and [c4core PR#148](https://github.com/biojppm/c4core/pull/148): Add CI-proven support for CPU architectures:
  - mips, mipsel, mips64, mips64el
  - sparc, sparc64
  - riscv64
  - loongarch64
- Fix [476](https://github.com/biojppm/rapidyaml/issues/476) ([PR#493](https://github.com/biojppm/rapidyaml/pull/493)): add handling of Byte Order Marks.
- [PR492](https://github.com/biojppm/rapidyaml/pull/492): fix emit of explicit keys when indented:
  ```yaml
  fixed:
    ? explicit key
    : value
  previously:
    ? explicit key
  : value # this was not indented
  ```

- [PR492](https://github.com/biojppm/rapidyaml/pull/492): fix parser reset for full reuse (`m_doc_empty` was not being reset), which could cause problems in subsequent reuse under specific scenarios.
- [PR485](https://github.com/biojppm/rapidyaml/pull/485): improve the CI workflows (thanks to ingydotnet):
  - amazing code reuse and organization, thanks to the use of YamlScript to generate the final workflows
  - all optimization levels are now covered for gcc, clang and Visual Studio.
- [PR499](https://github.com/biojppm/rapidyaml/pull/499): fix warnings with `-Wundef`.
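
As a rough illustration of what handling a Byte Order Mark involves, here is a minimal sketch that strips a leading UTF-8 BOM from a buffer before it is processed. It is not rapidyaml's implementation: `skip_utf8_bom()` is a hypothetical helper, and the sketch assumes only the UTF-8 mark (bytes `EF BB BF`) needs to be recognized.

```cpp
#include <cassert>
#include <string_view>

// Return the input without a leading UTF-8 BOM (bytes EF BB BF), if present.
// Hypothetical helper; other encodings' BOMs are deliberately ignored here.
inline std::string_view skip_utf8_bom(std::string_view src)
{
    constexpr std::string_view bom("\xEF\xBB\xBF", 3);
    if(src.substr(0, bom.size()) == bom)
        src.remove_prefix(bom.size());
    return src;
}
```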


Thanks

- ingydotnet
- perlpunk
- Delian0

0.7.2

Fixes

- Fix [464](https://github.com/biojppm/rapidyaml/issues/464): test failures with g++14 -O2 in ppc64le ([PR#467](https://github.com/biojppm/rapidyaml/pull/467))


Thanks

- musicinmybrain

0.7.1

New features

- [PR459](https://github.com/biojppm/rapidyaml/pull/459): Add version functions and macros:
  ```cpp
  #define RYML_VERSION "0.7.1"
  #define RYML_VERSION_MAJOR 0
  #define RYML_VERSION_MINOR 7
  #define RYML_VERSION_PATCH 1
  csubstr version();
  int version_major();
  int version_minor();
  int version_patch();
  ```
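
One possible use of the version macros is gating client code on a minimum library version. The sketch below assumes nothing beyond the three numeric macros listed above; `MY_VERSION_NUMBER`, `MY_RYML_VERSION` and `ryml_at_least()` are hypothetical helpers, and the fallback definitions exist only so the sketch compiles without the library.

```cpp
#include <cassert>

// Fallback values so that this sketch compiles on its own; with
// rapidyaml >= 0.7.1 these macros come from the library headers.
#ifndef RYML_VERSION_MAJOR
#define RYML_VERSION_MAJOR 0
#define RYML_VERSION_MINOR 7
#define RYML_VERSION_PATCH 1
#endif

// Hypothetical helper (not part of rapidyaml): pack a version into one integer.
#define MY_VERSION_NUMBER(major, minor, patch) \
    (((major) * 10000) + ((minor) * 100) + (patch))

#define MY_RYML_VERSION \
    MY_VERSION_NUMBER(RYML_VERSION_MAJOR, RYML_VERSION_MINOR, RYML_VERSION_PATCH)

// true when the library version is at least major.minor.patch
constexpr bool ryml_at_least(int major, int minor, int patch)
{
    return MY_RYML_VERSION >= MY_VERSION_NUMBER(major, minor, patch);
}

static_assert(ryml_at_least(0, 7, 1), "the version macros were added in 0.7.1");
```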


Fixes

- Fix [455](https://github.com/biojppm/rapidyaml/issues/455): parsing of trailing val-less nested maps when deindented to maps ([PR#460](https://github.com/biojppm/rapidyaml/pull/460))
- Fix filtering of double-quoted keys in block maps ([PR452](https://github.com/biojppm/rapidyaml/pull/452))
- Fix [440](https://github.com/biojppm/rapidyaml/issues/440): some tests failing with gcc -O2 (hypothetically due to undefined behavior)
  - This was accomplished by refactoring some internal parser functions; see the comments in [440](https://github.com/biojppm/rapidyaml/issues/440) for further details.
  - Also, fix all warnings from `scan-build`.
- Use malloc.h instead of alloca.h on MinGW ([PR447](https://github.com/biojppm/rapidyaml/pull/447))
- Fix [442](https://github.com/biojppm/rapidyaml/issues/442) ([PR#443](https://github.com/biojppm/rapidyaml/pull/443)):
  - Ensure leading `+` is accepted when deserializing numbers.
  - Ensure numbers are not quoted by fixing the heuristics in `scalar_style_query_plain()` and `scalar_style_choose()`.
  - Add quickstart sample for overflow detection (only of integral types).
- Parse engine: cleanup unused macros
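
To show why a leading `+` needs explicit handling, and how overflow of integral types can be detected, here is a small sketch built on the standard `std::from_chars()`, which does not itself accept a leading plus sign. `parse_int()` is a hypothetical helper and does not reflect how rapidyaml or c4core implement this.

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>

// Parse a decimal integer, tolerating one leading '+'.
// std::from_chars() rejects a leading '+', so it is skipped by hand.
// Returns false on malformed input, trailing characters or overflow.
inline bool parse_int(std::string_view s, int *out)
{
    if(s.size() > 1 && s.front() == '+' && s[1] != '-')
        s.remove_prefix(1);
    char const* first = s.data();
    char const* last = s.data() + s.size();
    auto result = std::from_chars(first, last, *out);
    // succeed only if there was no error and the whole input was consumed
    return result.ec == std::errc() && result.ptr == last;
}
```

On overflow `std::from_chars()` reports `std::errc::result_out_of_range`, so the helper returns `false` instead of yielding a wrapped value.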


Thanks

- marcalff
- toge
- musicinmybrain
- buty4649

0.7.0

Most of the changes are from the giant Parser refactor described below. Before getting to that, some other minor changes first.


Fixes

- [PR431](https://github.com/biojppm/rapidyaml/pull/431) - Emitter: prevent stack overflows when emitting malicious trees by providing a max tree depth for the emit visitor. This was done by adding an `EmitOptions` structure as an argument both to the emitter and to the emit functions, which is then forwarded to the emitter. This `EmitOptions` structure has a max tree depth setting with a default value of 64.
- [PR431](https://github.com/biojppm/rapidyaml/pull/431) - Fix `_RYML_CB_ALLOC()` using `(T)` in parentheses, which made the macro unusable.
- [434](https://github.com/biojppm/rapidyaml/issues/434) - Ensure empty vals are not deserialized ([PR#436](https://github.com/biojppm/rapidyaml/pull/436)).
- [PR433](https://github.com/biojppm/rapidyaml/pull/433):
  - Fix some corner cases causing read-after-free in the tree's arena when it is relocated while filtering scalars.
  - Improve YAML error conformance - detect YAML-mandated parse errors when:
    - directives are misplaced (eg [9MMA](https://matrix.yaml.info/details/9MMA.html), [9HCY](https://matrix.yaml.info/details/9HCY.html), [B63P](https://matrix.yaml.info/details/B63P.html), [EB22](https://matrix.yaml.info/details/EB22.html), [SF5V](https://matrix.yaml.info/details/SF5V.html)).
    - comments are misplaced (eg [MUS6/00](https://matrix.yaml.info/details/MUS6:00.html), [9JBA](https://matrix.yaml.info/details/9JBA.html), [SU5Z](https://matrix.yaml.info/details/SU5Z.html))
    - a node has both an anchor and an alias (eg [SR86](https://matrix.yaml.info/details/SR86.html), [SU74](https://matrix.yaml.info/details/SU74.html)).
    - tags contain [invalid characters](https://yaml.org/spec/1.2.2/#tag-shorthands) `,{}[]` (eg [LHL4](https://matrix.yaml.info/details/LHL4.html), [U99R](https://matrix.yaml.info/details/U99R.html), [WZ62](https://matrix.yaml.info/details/WZ62.html)).
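
The idea behind the max tree depth setting in the emitter fix above can be sketched with a toy recursive visitor that refuses to descend past a limit. `ToyNode` and `visit()` are hypothetical and unrelated to the actual emitter code; only the default limit of 64 is taken from the notes above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy tree: each node stores the indices of its children.
struct ToyNode { std::vector<std::size_t> children; };

// Recursive visit that refuses to descend past max_depth, returning
// false instead of recursing without bound on a maliciously deep tree.
inline bool visit(std::vector<ToyNode> const& tree, std::size_t node,
                  std::size_t depth, std::size_t max_depth)
{
    if(depth > max_depth)
        return false; // a real emitter would report an error here
    for(std::size_t child : tree[node].children)
        if(!visit(tree, child, depth + 1, max_depth))
            return false;
    return true;
}
```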


New features

- [PR431](https://github.com/biojppm/rapidyaml/pull/431) - append-emitting to existing containers in the `emitrs_` functions, suggested in [#345](https://github.com/biojppm/rapidyaml/issues/345). This was achieved by adding a `bool append=false` as the last parameter of these functions.
- [PR431](https://github.com/biojppm/rapidyaml/pull/431) - add depth query methods:
  ```cpp
  Tree::depth_asc(id_type) const; // O(log(num_tree_nodes)) get the depth of a node ascending (ie, from root to node)
  Tree::depth_desc(id_type) const; // O(num_tree_nodes) get the depth of a node descending (ie, from node to deep-most leaf node)
  ConstNodeRef::depth_asc() const; // likewise
  ConstNodeRef::depth_desc() const;
  NodeRef::depth_asc() const;
  NodeRef::depth_desc() const;
  ```

- [PR432](https://github.com/biojppm/rapidyaml/pull/432) - Added a function to estimate the required tree capacity, based on yaml markup:
  ```cpp
  size_t estimate_tree_capacity(csubstr); // estimate number of nodes resulting from yaml
  ```
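
The two depth queries listed above measure different things, which a toy tree makes concrete. The sketch below is not ryml's implementation: `DepthNode`, `NONE` and both functions are hypothetical, and it assumes the root has ascending depth 0 and a leaf has descending depth 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t NONE = static_cast<std::size_t>(-1);

// Hypothetical toy node, not ryml's node layout.
struct DepthNode
{
    std::size_t parent = NONE;
    std::vector<std::size_t> children;
};

// Number of steps from the node up to the root (root has depth 0).
inline std::size_t depth_asc(std::vector<DepthNode> const& t, std::size_t id)
{
    std::size_t depth = 0;
    while(t[id].parent != NONE)
    {
        id = t[id].parent;
        ++depth;
    }
    return depth;
}

// Number of steps from the node down to its deepest leaf (leaf has depth 0).
inline std::size_t depth_desc(std::vector<DepthNode> const& t, std::size_t id)
{
    std::size_t depth = 0;
    for(std::size_t child : t[id].children)
        depth = std::max(depth, 1 + depth_desc(t, child));
    return depth;
}
```

The ascending query only walks one path to the root, while the descending query has to visit the whole subtree, which matches the different complexities quoted in the comments above.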



------
All other changes come from [PR414](https://github.com/biojppm/rapidyaml/pull/414).

Parser refactor

The parser was completely refactored ([PR414](https://github.com/biojppm/rapidyaml/pull/414)). This was a large and hard job carried out over several months, but it brings important improvements.

- The new parser is an event-based parser, based on an event dispatcher engine. This engine is templated on the event handler, where each event is a function call, which spares branches on the event handler. The parsing code was fully rewritten, and is now much simpler (albeit longer), and much easier to work with and fix.
- YAML standard-conformance was improved significantly. Along with many smaller fixes and additions (too many to list here), the main changes are the following:
  - The parser engine can now successfully parse container keys, emitting all the events correctly, **but** as before, the ryml tree cannot accommodate these (and this constraint is no longer enforced by the parser, but instead by `EventHandlerTree`). For an example of a handler which can accommodate key containers, see the one which is used for the test suite at `test/test_suite/test_suite_event_handler.hpp`
  - Anchor keys can now be terminated with colon (eg, `&anchor: key: val`), as dictated by the standard.
- The parser engine can now be used to create native trees in other programming languages, or in cases where the user *must* have container keys.
- Performance of both parsing and emitting improved significantly; see some figures below.


Strict JSON parser

- A strict JSON parser was added. Use the `parse_json_...()` family of functions to parse json in stricter mode (and faster) than flow-style YAML.


YAML style preserved while parsing

- The YAML style information is now fully preserved through parsing/emitting round trips. This was made possible because the event model of the new parsing engine now incorporates style varieties. So, for example:
  - a scalar parsed from a plain/single-quoted/double-quoted/block-literal/block-folded scalar will be emitted always using its original style in the YAML source
  - a container parsed in block-style will always be emitted in block-style
  - a container parsed in flow-style will always be emitted in flow-style

  Because of this, the style of YAML emitted by ryml changes from previous releases.
- Scalar filtering was improved and is now done directly in the source being parsed (which may be in place or in the arena), except in the cases where the scalar expands and does not fit its initial range, in which case the scalar is filtered out of place to the tree's arena.
- Filtering can now be disabled while parsing, to ensure a fully-readonly parse (but this feature is still experimental and somewhat untested, given the scope of the rewrite work).
- The parser now offers methods to filter scalars in place or out of place.
- Style flags were added to `NodeType_e`:
  ```cpp
  FLOW_SL ///< mark container with single-line flow style (seqs as '[val1,val2], maps as '{key: val,key2: val2}')
  FLOW_ML ///< mark container with multi-line flow style (seqs as '[\n val1,\n val2\n], maps as '{\n key: val,\n key2: val2\n}')
  BLOCK ///< mark container with block style (seqs as '- val\n', maps as 'key: val')
  KEY_LITERAL ///< mark key scalar as multiline, block literal |
  VAL_LITERAL ///< mark val scalar as multiline, block literal |
  KEY_FOLDED ///< mark key scalar as multiline, block folded >
  VAL_FOLDED ///< mark val scalar as multiline, block folded >
  KEY_SQUO ///< mark key scalar as single quoted '
  VAL_SQUO ///< mark val scalar as single quoted '
  KEY_DQUO ///< mark key scalar as double quoted "
  VAL_DQUO ///< mark val scalar as double quoted "
  KEY_PLAIN ///< mark key scalar as plain scalar (unquoted, even when multiline)
  VAL_PLAIN ///< mark val scalar as plain scalar (unquoted, even when multiline)
  ```

- Style predicates were added to `NodeType`, `Tree`, `ConstNodeRef` and `NodeRef`:
  ```cpp
  bool is_container_styled() const;
  bool is_block() const;
  bool is_flow_sl() const;
  bool is_flow_ml() const;
  bool is_flow() const;

  bool is_key_styled() const;
  bool is_val_styled() const;
  bool is_key_literal() const;
  bool is_val_literal() const;
  bool is_key_folded() const;
  bool is_val_folded() const;
  bool is_key_squo() const;
  bool is_val_squo() const;
  bool is_key_dquo() const;
  bool is_val_dquo() const;
  bool is_key_plain() const;
  bool is_val_plain() const;
  ```

- Style modifiers were also added:
  ```cpp
  void set_container_style(NodeType_e style);
  void set_key_style(NodeType_e style);
  void set_val_style(NodeType_e style);
  ```

- Emit helper predicates were added, and are used when an emitted node was built programmatically without style flags:
  ```cpp
  /** choose a YAML emitting style based on the scalar's contents */
  NodeType_e scalar_style_choose(csubstr scalar) noexcept;
  /** query whether a scalar can be encoded using single quotes.
   * It may not be possible, notably when there is leading
   * whitespace after a newline. */
  bool scalar_style_query_squo(csubstr s) noexcept;
  /** query whether a scalar can be encoded using plain style (no
   * quotes, not a literal/folded block scalar). */
  bool scalar_style_query_plain(csubstr s) noexcept;
  ```
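
In-place scalar filtering, mentioned in the list above, works whenever the filtered scalar is no longer than its source. A minimal sketch of that idea, limited to the `''` escape of single-quoted scalars, is shown below. `filter_squo_inplace()` is a hypothetical function, it ignores line folding, and it is not the parser's actual filtering code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Filter the body of a single-quoted scalar in place: each escaped
// pair '' collapses to a single '. The result is never longer than the
// input, so the write index never overtakes the read index.
// Returns the length of the filtered scalar.
inline std::size_t filter_squo_inplace(char *buf, std::size_t len)
{
    std::size_t w = 0;
    for(std::size_t r = 0; r < len; ++r)
    {
        buf[w++] = buf[r];
        if(buf[r] == '\'' && r + 1 < len && buf[r + 1] == '\'')
            ++r; // skip the second quote of the escaped pair
    }
    return w;
}
```

A filter that can make the scalar grow has no such guarantee, which is the case where the notes above say the scalar is filtered out of place into the tree's arena.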


Breaking changes

As a result of the refactor, a few changes affect client code. Even though this was a large refactor, effort was directed at keeping maximal backwards compatibility, so the changes are not wide-ranging. But they still exist:

- The existing `parse_...()` methods in the `Parser` class were all removed. Use the corresponding `parse_...(Parser*, ...)` function from the header [`c4/yml/parse.hpp`](https://github.com/biojppm/rapidyaml/blob/master/src/c4/yml/parse.hpp).
- When instantiated by the user, the parser now needs to receive an `EventHandlerTree` object, which is responsible for building the tree. Although fully functional and tested, the structure of this class is still somewhat experimental and is still likely to change. There is an alternative event handler implementation responsible for producing the events for the YAML test suite in `test/test_suite/test_suite_event_handler.hpp`.
- The declaration and definition of `NodeType` was moved to a separate header file `c4/yml/node_type.hpp` (previously it was in `c4/yml/tree.hpp`).
- Some of the node type flags were removed, and several flags (and combination flags) were added.
  - Most of the existing flags are kept, as well as their meaning.
  - `KEYQUO` and `VALQUO` are now masks of the several style flags for quoted scalars. In general, however, client code using these flags and `.is_val_quoted()` or `.is_key_quoted()` is not likely to require any changes.
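
The note about `KEYQUO` and `VALQUO` becoming masks can be illustrated with a toy bitmask. The flag names and bit values below are hypothetical; the real ones are declared in `c4/yml/node_type.hpp`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values, for illustration only.
enum ToyType : std::uint32_t
{
    TOY_VAL_SQUO  = 1u << 0, // single-quoted val
    TOY_VAL_DQUO  = 1u << 1, // double-quoted val
    TOY_VAL_PLAIN = 1u << 2, // plain val
    // the "quoted" flag is a mask over the individual quoted styles
    TOY_VALQUO = TOY_VAL_SQUO | TOY_VAL_DQUO,
};

// A val is quoted when any of the bits in the mask is set.
inline bool is_val_quoted(std::uint32_t type_bits)
{
    return (type_bits & TOY_VALQUO) != 0;
}
```

Code that tests with `(bits & mask) != 0` keeps working under this scheme, whereas code comparing for equality with the mask would not.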


New type for node IDs

A type `id_type` was added to signify the integer type for the node id, defaulting to the backwards-compatible `size_t` which was previously used in the tree. In the future, this type is likely to change, *and probably to a signed type*, so client code is encouraged to always use `id_type` instead of the `size_t`, and specifically not to rely on the signedness of this type.
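
A sketch of client code that does not depend on the signedness of the id type follows. Here `id_type` is a local alias standing in for the library's type, and `NONE` and `find_first()` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in for the library's id type; swapping this alias for a
// signed integer type should not change the behavior of the code below.
using id_type = std::size_t;

// Sentinel built by conversion from -1: it is the maximum value for an
// unsigned id_type and plain -1 for a signed one.
constexpr id_type NONE = static_cast<id_type>(-1);

// Return the index of the first element equal to value, or NONE.
inline id_type find_first(std::vector<int> const& v, int value)
{
    for(id_type i = 0; i < static_cast<id_type>(v.size()); ++i)
        if(v[static_cast<std::size_t>(i)] == value)
            return i;
    return NONE;
}
```

The code compares against the sentinel with `==` only, and never relies on checks such as `id < 0` or on unsigned wrap-around.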


Reference resolver is now exposed

The reference (ie, alias) resolver object is now exposed in [`c4/yml/reference_resolver.hpp`](https://github.com/biojppm/rapidyaml/blob/master/src/c4/yml/reference_resolver.hpp). Previously this object was temporarily instantiated in `Tree::resolve()`. Exposing it now enables the user to reuse this object through different calls, saving a potential allocation on every call.


Tag utilities

Tag utilities were moved to the new header [`c4/yml/tag.hpp`](https://github.com/biojppm/rapidyaml/blob/master/src/c4/yml/tag.hpp). The types `Tree::tag_directive_const_iterator` and `Tree::TagDirectiveProxy` were deprecated. Also fixed an uninitialization problem with `Tree::m_tag_directives`.


Performance improvements

To compare performance before and after this changeset, the benchmarks were run on the same PC, and the results were collected into these two files:
- [results before newparser](https://github.com/biojppm/rapidyaml/blob/master/bm/results/results_before_newparser.md)
- [results after newparser](https://github.com/biojppm/rapidyaml/blob/master/bm/results/results_after_newparser.md)
- (suggestion: compare these files in a diff viewer)

There are a lot of results in these files, and many insights can be obtained by browsing them; too many to list here. Below we show only some selected results.


Parsing
Here are some figures for parsing performance, for `bm_ryml_inplace_reuse` (name before) / `bm_ryml_yaml_inplace_reuse` (name after):

| case | B/s before newparser | B/s after newparser | improv % |
|------|------------|-----------|--------|
| [PARSE/appveyor.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/appveyor.yml) | 168.628Mi/s | 232.017Mi/s | ~+40% |
| [PARSE/compile_commands.json](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/compile_commands.yml) | 630.17Mi/s | 609.877Mi/s | ~-3% |
| [PARSE/travis.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/travis.yml) | 193.674Mi/s | 271.598Mi/s | ~+40% |
| [PARSE/scalar_dquot_multiline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_multiline.yml) | 224.796Mi/s | 187.335Mi/s | ~-17% |
| [PARSE/scalar_dquot_singleline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_singleline.yml) | 339.889Mi/s | 388.924Mi/s | ~+14% |

Some conclusions:
- parse performance improved by ~30%-50% for YAML without filtering-heavy parsing.
- parse performance *decreased* by ~10%-15% for YAML with filtering-heavy parsing. There is still some scope for improvement in the parsing code, so this cost may hopefully be minimized in the future.


Emitting

Here are some figures for emitting performance, retrieved from these files, for `bm_ryml_str_reserve` (name before) / `bm_ryml_yaml_str_reserve` (name after):

| case | B/s before newparser | B/s after newparser |
|------|------------|-----------|
| [EMIT/appveyor.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/appveyor.yml) | 311.718Mi/s | 1018.44Mi/s |
| [EMIT/compile_commands.json](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/compile_commands.yml) | 434.206Mi/s | 771.682Mi/s |
| [EMIT/travis.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/travis.yml) | 333.322Mi/s | 1.41597Gi/s |
| [EMIT/scalar_dquot_multiline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_multiline.yml) | 868.6Mi/s | 692.564Mi/s |
| [EMIT/scalar_dquot_singleline.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/scalar_dquot_singleline.yml) | 336.98Mi/s | 638.368Mi/s |
| [EMIT/style_seqs_flow_outer1000_inner100.yml](https://github.com/biojppm/rapidyaml/blob/master/bm/cases/style_seqs_flow_outer1000_inner100.yml) | 136.826Mi/s | 279.487Mi/s |

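Emit performance improved by over 1.5x in most cases, and by as much as 3x-4x for YAML without filtering-heavy parsing. The exception in the table above is `scalar_dquot_multiline.yml`, where emit throughput dropped by ~20%.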

0.6.0

Add API documentation

- [PR423](https://github.com/biojppm/rapidyaml/pull/423): **add Doxygen-based API documentation, now hosted in [https://rapidyaml.readthedocs.io/](https://rapidyaml.readthedocs.io/)!**
- It uses the base doxygen docs, as I couldn't get doxyrest or breathe or exhale to produce anything meaningful using the doxygen groups already defined in the source code.


Error handling

Fix major error handling problem reported in [389](https://github.com/biojppm/rapidyaml/issues/389) ([PR#411](https://github.com/biojppm/rapidyaml/pull/411)):

- The `NodeRef` and `ConstNodeRef` classes are now conditionally `noexcept` via `RYML_NOEXCEPT`, which evaluates to nothing when assertions are enabled, and to `noexcept` otherwise. The problem was that these classes had many methods explicitly marked `noexcept` that performed assertions which could throw exceptions, causing an abort instead of a throw whenever the assertion called an exception-throwing error callback.
- This problem was compounded by assertions being enabled in every build type -- despite the intention to have them only in debug builds. There was a problem in the preprocessor code to enable assertions which led to assertions being enabled in release builds even when `RYML_USE_ASSERT` was defined to 0. Thanks to jdrouhard for reporting this.
- Although the code is and was extensively tested, the testing was addressing mostly the happy path. Tests were added to ensure that the error behavior is as intended.
- Together with this changeset, a major revision was carried out of the asserting/checking status of each function in the node classes. In most cases, assertions were added to functions that were missing them. So **beware** - some user code that was invalid will now assert or error out. Also, assertions and checks are now directed as much as possible to the callbacks of the closest scope: ie, if a tree has custom callbacks, errors within the tree class should go through those callbacks.
- Also, the intended assertion behavior is now in place: *no assertions in release builds*. **Beware** as well - user code which was relying on this will now silently succeed and return garbage in release builds. See the next points, which may help.
- Added new methods to the `NodeRef`/`ConstNodeRef` classes:
  ```c++
  /** Distinguish between a valid seed vs a valid non-seed ref. */
  bool readable() const { return valid() && !is_seed(); }

  /** Get a child by name, with error checking; complexity is
   * O(num_children).
   *
   * Behaves as operator[](csubstr) const, but always raises an
   * error (even when RYML_USE_ASSERT is set to false) when the
   * returned node does not exist, or when this node is not
   * readable, or when it is not a map. This behaviour is similar to
   * std::vector::at(), but the error consists in calling the error
   * callback instead of directly raising an exception. */
  ConstNodeRef at(csubstr key) const;
  /** Likewise, but return a seed node when the key is not found */
  NodeRef at(csubstr key);

  /** Get a child by position, with error checking; complexity is
   * O(pos).
   *
   * Behaves as operator[](size_t) const, but always raises an error
   * (even when RYML_USE_ASSERT is set to false) when the returned
   * node does not exist, or when this node is not readable, or when
   * it is not a container. This behaviour is similar to
   * std::vector::at(), but the error consists in calling the error
   * callback instead of directly raising an exception. */
  ConstNodeRef at(size_t pos) const;
  /** Likewise, but return a seed node when pos is not found */
  NodeRef at(size_t pos);
  ```

- The state for `NodeRef` was refined, and now there are three mutually exclusive states (and class predicates) for an object of this class:
  - `.invalid()` when the object was not initialized to any node
  - `.readable()` when the object points at an existing tree+node
  - `.is_seed()` when the object points at a hypothetical tree+node
- The previous state `.valid()` was deprecated: its semantics were confusing as it actually could be any of `.readable()` or `.is_seed()`
- Deprecated also the following methods for `NodeRef`/`ConstNodeRef`:
  ```c++
  RYML_DEPRECATED() bool operator== (std::nullptr_t) const;
  RYML_DEPRECATED() bool operator!= (std::nullptr_t) const;
  RYML_DEPRECATED() bool operator== (csubstr val) const;
  RYML_DEPRECATED() bool operator!= (csubstr val) const;
  ```

- Added macros and respective cmake options to control error handling:
  - `RYML_USE_ASSERT` - enable assertions regardless of build type. This is disabled by default. This macro was already defined; the current PR adds the cmake option.
  - `RYML_DEFAULT_CALLBACK_USES_EXCEPTIONS` - make the default error handler provided by ryml throw exceptions instead of calling `std::abort()`. This is disabled by default.
- Also, `RYML_DEBUG_BREAK()` is now enabled only if `RYML_DBG` is defined, as reported in [362](https://github.com/biojppm/rapidyaml/issues/362).
- As part of [PR423](https://github.com/biojppm/rapidyaml/pull/423), to improve linters and codegen:
  - annotate the error handlers with `[[noreturn]]`/`C4_NORETURN`
  - annotate some error sites with `C4_UNREACHABLE_AFTER_ERR()`
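
The conditional `noexcept` mechanism described at the top of this list can be sketched as follows. All names here (`MY_USE_ASSERT`, `MY_NOEXCEPT`, `MY_ASSERT`, `ToyRef`) are hypothetical, and the assertion throws directly, whereas in ryml it goes through an error callback that may or may not throw.

```cpp
#include <cassert>
#include <stdexcept>

#ifndef MY_USE_ASSERT
#define MY_USE_ASSERT 1 // assume an assertion-enabled build for this sketch
#endif

#if MY_USE_ASSERT
    // assertions may throw, so the method must not be noexcept
    #define MY_NOEXCEPT
    #define MY_ASSERT(cond) \
        do { if(!(cond)) throw std::runtime_error(#cond); } while(0)
#else
    // no assertions: nothing can throw, so noexcept is safe
    #define MY_NOEXCEPT noexcept
    #define MY_ASSERT(cond) ((void)0)
#endif

struct ToyRef
{
    int const* ptr = nullptr;
    int get() const MY_NOEXCEPT
    {
        MY_ASSERT(ptr != nullptr);
        return *ptr;
    }
};
```

Had `get()` been unconditionally `noexcept`, the exception thrown by the failed assertion would have ended in `std::terminate()` rather than reaching the caller, which is the abort-instead-of-throw problem described above.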


More fixes

- `Tree::arena() const` was returning a `substr`; this was an error. This function was changed to:

  ```c++
  csubstr Tree::arena() const;
  substr Tree::arena();
  ```

- Fix [390](https://github.com/biojppm/rapidyaml/issues/390) - `csubstr::first_real_span()` failed on scientific numbers with one digit in the exponent ([PR#415](https://github.com/biojppm/rapidyaml/pull/415)).
- Fix [361](https://github.com/biojppm/rapidyaml/issues/361) - parse error on map scalars containing `:` and starting on the next line:
  ```yaml
  ---
  # failed to parse:
  description:
    foo:bar
  ---
  # but this was ok:
  description: foo:bar
  ```

- [PR368](https://github.com/biojppm/rapidyaml/pull/368) - fix pedantic compiler warnings.
- Fix [373](https://github.com/biojppm/rapidyaml/issues/373) - false parse error with empty quoted keys in block-style map ([PR#374](https://github.com/biojppm/rapidyaml/pull/374)).
- Fix [356](https://github.com/biojppm/rapidyaml/issues/356) - fix overzealous check in `emit_as()`. An id may be larger than the tree's size, eg when nodes were removed. ([PR#357](https://github.com/biojppm/rapidyaml/pull/357)).
- Fix [417](https://github.com/biojppm/rapidyaml/issues/417) - add quickstart example explaining how to avoid precision loss while serializing floats ([PR#420](https://github.com/biojppm/rapidyaml/pull/420)).
- Fix [380](https://github.com/biojppm/rapidyaml/issues/380) - Debug visualizer .natvis file for Visual Studio was missing `ConstNodeRef` ([PR#383](https://github.com/biojppm/rapidyaml/issues/383)).
- FR [403](https://github.com/biojppm/rapidyaml/issues/403) - install is now optional when using cmake. The relevant option is `RYML_INSTALL`.
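
The precision-loss problem behind [417](https://github.com/biojppm/rapidyaml/issues/417) is not specific to ryml. The sketch below uses only the C++ standard library to show that a float survives a text round trip when it is written with `max_digits10` significant digits, and does not when it is written with fewer. `float_to_string()` is a hypothetical helper and is not the approach taken in the quickstart example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Format a float with the given number of significant digits.
inline std::string float_to_string(float value, int precision)
{
    char buf[64];
    int len = std::snprintf(buf, sizeof(buf), "%.*g", precision,
                            static_cast<double>(value));
    return std::string(buf, static_cast<std::size_t>(len));
}

// true when parsing the text back yields exactly the original value
inline bool roundtrips(float value, int precision)
{
    std::string text = float_to_string(value, precision);
    return std::strtof(text.c_str(), nullptr) == value;
}
```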


Python

- Fix [428](https://github.com/biojppm/rapidyaml/issues/428)/[#412](https://github.com/biojppm/rapidyaml/discussions/412) - Parse errors now throw `RuntimeError` instead of aborting.



Thanks

- Neko-Box-Coder
- jdrouhard
- dmachaj

0.5.0

Breaking changes

- Make the node API const-correct ([PR267](https://github.com/biojppm/rapidyaml/pull/267)): added `ConstNodeRef` to hold a constant reference to a node. As the name implies, a `ConstNodeRef` object cannot be used in any tree-mutating operation. It is also smaller than the existing `NodeRef`, and faster because it does not need to check its own validity on every access. As a result of this change, there are now some constraints when obtaining a ref from a tree, and existing code is likely to break in this type of situation:
  ```c++
  const Tree const_tree = ...;
  NodeRef nr = const_tree.rootref(); // ERROR (was ok): cannot obtain a mutating NodeRef from a const Tree
  ConstNodeRef cnr = const_tree.rootref(); // ok

  Tree tree = ...;
  NodeRef nr = tree.rootref(); // ok
  ConstNodeRef cnr = tree.rootref(); // ok (implicit conversion from NodeRef to ConstNodeRef)
  // to obtain a ConstNodeRef from a mutable Tree
  // while avoiding implicit conversion, use the `c`
  // prefix:
  ConstNodeRef cnr = tree.crootref();
  // likewise for tree.ref() and tree.cref().

  nr = cnr; // ERROR: cannot obtain NodeRef from ConstNodeRef
  cnr = nr; // ok
  ```

The use of `ConstNodeRef` also needs to be propagated through client code. One such place is when deserializing types:
  ```c++
  // needs to be changed from:
  template<class T> bool read(ryml::NodeRef const& n, T *var);
  // ... to:
  template<class T> bool read(ryml::ConstNodeRef const& n, T *var);
  ```

- The initial version of `ConstNodeRef/NodeRef` had the problem that const methods in the CRTP base did not participate in overload resolution ([294](https://github.com/biojppm/rapidyaml/issues/294)), preventing calls from `const NodeRef` objects. This was fixed by moving non-const methods to the CRTP base and disabling them with SFINAE ([PR#295](https://github.com/biojppm/rapidyaml/pull/295)).
- Also added disambiguation iteration methods: `.cbegin()`, `.cend()`, `.cchildren()`, `.csiblings()` ([PR295](https://github.com/biojppm/rapidyaml/pull/295)).
- Deprecate `emit()` and `emitrs()` ([120](https://github.com/biojppm/rapidyaml/issues/120), [PR#303](https://github.com/biojppm/rapidyaml/pull/303)): use `emit_yaml()` and `emitrs_yaml()` instead. This was done to improve compatibility with Qt, which leaks a macro named `emit`. For more information, see [#120](https://github.com/biojppm/rapidyaml/issues/120).
- In the Python API:
  - Deprecate `emit()`, add `emit_yaml()` and `emit_json()`.
  - Deprecate `compute_emit_length()`, add `compute_emit_yaml_length()` and `compute_emit_json_length()`.
  - Deprecate `emit_in_place()`, add `emit_yaml_in_place()` and `emit_json_in_place()`.
  - Calling the deprecated functions will now trigger a warning.
- Location querying is no longer done lazily ([260](https://github.com/biojppm/rapidyaml/issues/260), [PR#307](https://github.com/biojppm/rapidyaml/pull/307)). It now requires explicit opt-in when instantiating the parser. With this change, the accelerator structure for location querying is now built when parsing:
  ```c++
  Parser parser(ParserOptions().locations(true));
  // now parsing also builds location lookup:
  Tree t = parser.parse_in_arena("myfile.yml", "foo: bar");
  assert(parser.location(t["foo"]).line == 0u);
  ```

- Locations are disabled by default:
  ```c++
  Parser parser;
  assert(parser.options().locations() == false);
  ```

- Deprecate `Tree::arena_pos()`: use `Tree::arena_size()` instead ([PR290](https://github.com/biojppm/rapidyaml/pull/290)).
- Deprecate pointless `has_siblings()`: use `Tree::has_other_siblings()` instead ([PR330](https://github.com/biojppm/rapidyaml/pull/330)).


Performance improvements

- Improve performance of integer serialization and deserialization (in [c4core](https://github.com/biojppm/c4core)). Eg, on Linux/g++11.2, with integral types:
  - `c4::to_chars()` can be expected to be roughly...
    - ~40% to 2x faster than `std::to_chars()`
    - ~10x-30x faster than `sprintf()`
    - ~50x-100x faster than a naive `stringstream::operator<<()` followed by `stringstream::str()`
  - `c4::from_chars()` can be expected to be roughly...
    - ~10%-30% faster than `std::from_chars()`
    - ~10x faster than `scanf()`
    - ~30x-50x faster than a naive `stringstream::str()` followed by `stringstream::operator>>()`

  For more details, see [the changelog for c4core 0.1.10](https://github.com/biojppm/c4core/releases/tag/v0.1.10).
- Fix [289](https://github.com/biojppm/rapidyaml/issues/289) and [#331](https://github.com/biojppm/rapidyaml/issues/331) - parsing of single-line flow-style sequences had quadratic complexity, causing long parse times in ultra long lines [PR#293](https://github.com/biojppm/rapidyaml/pull/293)/[PR#332](https://github.com/biojppm/rapidyaml/pull/332).
  - This was due to scanning for the token `: ` before scanning for `,` or `]`, which caused line-length scans on every scalar scan. Changing the order of the checks was enough to address the quadratic complexity, and the parse times for flow-style are now in line with block-style.
  - As part of this changeset, a significant number of runtime branches was eliminated by separating `Parser::_scan_scalar()` into several different `{seq,map}x{block,flow}` functions specific for each context. Expect some improvement in parse times.
  - Also, on Debug builds (or assertion-enabled builds) there was a paranoid assertion calling `Tree::has_child()` in `Tree::insert_child()` that caused quadratic behavior because the assertion had linear complexity. It was replaced with a somewhat equivalent O(1) assertion.
  - Now the byte throughput is independent of line size for styles and containers. This can be seen in the table below, which shows parse throughputs in MB/s of 1000 containers of different styles and sizes (flow containers are in a single line):

| Container | Style | 10elms | 100elms | 1000elms |
|-----------|-------|-------------|--------------|---------------|
| 1000 Maps | block | 50.8MB/s | 57.8MB/s | 63.9MB/s |
| 1000 Maps | flow | 58.2MB/s | 65.9MB/s | 74.5MB/s |
| 1000 Seqs | block | 55.7MB/s | 59.2MB/s | 60.0MB/s |
| 1000 Seqs | flow | 52.8MB/s | 55.6MB/s | 54.5MB/s |
- Fix [329](https://github.com/biojppm/rapidyaml/issues/329): complexity of `has_sibling()` and `has_child()` is now O(1), previously was linear ([PR#330](https://github.com/biojppm/rapidyaml/pull/330)).
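
The quadratic behavior described above for single-line flow sequences can be reproduced with a toy scanner that counts how many characters it examines. This is only a model of the order-of-checks problem: `chars_examined()` is hypothetical and bears no relation to the actual `Parser` code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Count the characters examined while splitting "a,b,c,..." into scalars.
// When check_colon_first is true, every scalar scan first searches the
// rest of the line for ": " (never found in a plain sequence), which
// models the old order of checks.
inline std::size_t chars_examined(std::string_view line, bool check_colon_first)
{
    std::size_t examined = 0;
    while(!line.empty())
    {
        if(check_colon_first)
        {
            std::size_t colon = line.find(": ");
            examined += (colon == std::string_view::npos) ? line.size() : colon + 2;
        }
        std::size_t comma = line.find(',');
        std::size_t len = (comma == std::string_view::npos) ? line.size() : comma + 1;
        examined += len;
        line.remove_prefix(len);
    }
    return examined;
}
```

With the colon check first, each of the n scalars triggers a scan of the remaining line, for a total on the order of n² characters; scanning for the separator first touches every character once.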


Fixes

- Fix [233](https://github.com/biojppm/rapidyaml/issues/233) - accept leading colon in the first key of a flow map (`UNK` node) [PR#234](https://github.com/biojppm/rapidyaml/pull/234):
  ```yaml
  :foo:           # parse error on the leading colon
    :bar: a       # parse error on the leading colon
    :barbar: b    # was ok
    :barbarbar: c # was ok
  foo:            # was ok
    bar: a        # was ok
    :barbar: b    # was ok
    :barbarbar: c # was ok
  ```

- Fix [253](https://github.com/biojppm/rapidyaml/issues/253): double-quoted emitter should encode carriage-return `\r` to preserve roundtrip equivalence:
```cpp
Tree tree;
NodeRef root = tree.rootref();
root |= MAP;
root["s"] = "t\rt";
root["s"] |= _WIP_VAL_DQUO;
std::string s = emitrs<std::string>(tree);
EXPECT_EQ(s, "s: \"t\\rt\"\n");
Tree tree2 = parse_in_arena(to_csubstr(s));
EXPECT_EQ(tree2["s"].val(), tree["s"].val());
```

- Fix parsing of empty block folded+literal scalars when they are the last child of a container (part of [PR264](https://github.com/biojppm/rapidyaml/issues/264)):
```yaml
seq:
  - ""
  - ''
  - >
  - |  # error, the resulting val included all the YAML from the next node
seq2:
  - ""
  - ''
  - |
  - >  # error, the resulting val included all the YAML from the next node
map:
  a: ""
  b: ''
  c: >
  d: |  # error, the resulting val included all the YAML from the next node
map2:
  a: ""
  b: ''
  c: |
  d: >  # error, the resulting val included all the YAML from the next node
lastly: the last
```

- Fix [274](https://github.com/biojppm/rapidyaml/issues/274) ([PR#296](https://github.com/biojppm/rapidyaml/pull/296)): Lists with unindented items and trailing empty values parse incorrectly:
```yaml
foo:
- bar
-
baz: qux
```

was wrongly parsed as

```yaml
foo:
- bar
- baz: qux
```

- Fix [277](https://github.com/biojppm/rapidyaml/issues/277) ([PR#340](https://github.com/biojppm/rapidyaml/pull/340)): merge fails with duplicate keys.
- Fix [337](https://github.com/biojppm/rapidyaml/issues/337) ([PR#338](https://github.com/biojppm/rapidyaml/pull/338)): empty lines in block scalars shall not have tab characters `\t`.
- Fix [268](https://github.com/biojppm/rapidyaml/issues/268) ([PR#339](https://github.com/biojppm/rapidyaml/pull/339)): don't override key type_bits when copying val. This was causing problematic resolution of anchors/references.
- Fix [309](https://github.com/biojppm/rapidyaml/issues/309) ([PR#310](https://github.com/biojppm/rapidyaml/pull/310)): emitted scalars containing `@` or `` ` `` should be quoted.
- The quotes should be added only when they lead the scalar. See [320](https://github.com/biojppm/rapidyaml/issues/320) and [PR#334](https://github.com/biojppm/rapidyaml/pull/334).
- Fix [297](https://github.com/biojppm/rapidyaml/issues/297) ([PR#298](https://github.com/biojppm/rapidyaml/pull/298)): JSON emitter should escape control characters.
- Fix [292](https://github.com/biojppm/rapidyaml/issues/292) ([PR#299](https://github.com/biojppm/rapidyaml/pull/299)): JSON emitter should quote version string scalars like `0.1.2`.
- Fix [291](https://github.com/biojppm/rapidyaml/issues/291) ([PR#299](https://github.com/biojppm/rapidyaml/pull/299)): JSON emitter should quote scalars with leading zero, eg `048`.
- Fix [280](https://github.com/biojppm/rapidyaml/issues/280) ([PR#281](https://github.com/biojppm/rapidyaml/pull/281)): deserialization of `std::vector<bool>` failed because its `operator[]` returns a `reference` instead of `value_type`.
- Fix [288](https://github.com/biojppm/rapidyaml/issues/288) ([PR#290](https://github.com/biojppm/rapidyaml/pull/290)): segfault on successive calls to `Tree::_grow_arena()`, caused by using the arena position instead of its length as starting point for the new arena capacity.
- Fix [324](https://github.com/biojppm/rapidyaml/issues/324) ([PR#328](https://github.com/biojppm/rapidyaml/pull/328)): eager assertion prevented moving nodes to the first position in a parent.
- Fix `Tree::_clear_val()`: was clearing key instead ([PR335](https://github.com/biojppm/rapidyaml/pull/335)).
- YAML test suite events emitter: fix emission of inheriting nodes. The events for `{<<: *anchor, foo: bar}` are now correctly emitted as:
```yaml
=VAL :<<  # previously was =ALI <<
=ALI *anchor
=VAL :foo
=VAL :bar
```

- Fix [246](https://github.com/biojppm/rapidyaml/issues/246): add missing `#define` for the include guard of the amalgamated header.
- Fix [326](https://github.com/biojppm/rapidyaml/issues/326): honor runtime settings for calling debugbreak, add option to disable any calls to debugbreak.
- Fix [cmake8](https://github.com/biojppm/cmake/issues/8): `SOVERSION` missing from shared libraries.
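
The `std::vector<bool>` fix listed above comes from a well-known property of that specialization: its `operator[]` returns a proxy object (`std::vector<bool>::reference`) rather than a `bool&`, so generic code that binds a `bool&` or `bool*` to an element does not compile. The sketch below shows one common workaround, reading into a temporary `bool` and assigning it through the proxy. It is not rapidyaml's implementation: `read_scalar()` and `read_bool_seq()` are hypothetical helpers made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a scalar reader, NOT rapidyaml's API.
// Writes the parsed value through a plain bool pointer.
inline bool read_scalar(std::string const& text, bool *out)
{
    assert(out != nullptr);
    if(text == "true")  { *out = true;  return true; }
    if(text == "false") { *out = false; return true; }
    return false; // not a boolean
}

// A generic reader would typically do `read_scalar(texts[i], &(*out)[i])`.
// That works for std::vector<int>-like containers, but for std::vector<bool>
// the expression `(*out)[i]` is a temporary proxy object, so its address is
// not a `bool*` and the call does not compile.
//
// Workaround: read into a real bool, then assign through the proxy.
inline bool read_bool_seq(std::vector<std::string> const& texts, std::vector<bool> *out)
{
    assert(out != nullptr);
    out->resize(texts.size());
    for(std::size_t i = 0; i < texts.size(); ++i)
    {
        bool tmp = false;              // addressable bool
        if(!read_scalar(texts[i], &tmp))
            return false;
        (*out)[i] = tmp;               // proxy assignment sets the packed bit
    }
    return true;
}
```

With this shape, `read_bool_seq({"true", "false", "true"}, &v)` fills `v` with three elements, and an unparseable entry makes the function return `false`.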


Python

- The Python packages for Windows and MacOSX are causing problems in the CI, and were mostly disabled. The problematic packages are successfully made, but then fail to be imported. This was impossible to reproduce outside of the CI, and they were disabled since they were delaying the release. As a consequence, the Python release will have very limited compiled packages for Windows (only Python 3.6 and 3.7) or MacOSX. Help would be appreciated from those interested in these packages.


Thanks

- NaN-git
- dancingbug
