The R YAML package implements the libyaml YAML parser and emitter for R.
YAML is a human-readable data serialization language. With it, you can create easily readable documents that can be consumed by a variety of programming languages.
You can install this package directly from CRAN by running (from within R): install.packages('yaml')
To install from a downloaded source tarball instead, run (from a shell): R CMD INSTALL <filename>
The yaml package provides three functions: yaml.load, yaml.load_file, and as.yaml.
yaml.load is the YAML parsing function. It accepts a YAML document as a string. Here’s a simple example that parses a YAML sequence:
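A minimal sketch of such a call, assuming the yaml package is installed; the variable names doc and x are only illustrative:

```r
library(yaml)

# A YAML sequence with three integer elements, held in an R string
doc <- "
- 1
- 2
- 3
"

# Parse the document; the three uniform integers come back as one vector
x <- yaml.load(doc)
print(x)
```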
A YAML scalar is the basic building block of YAML documents. For example, a YAML document containing only 1.2345 has a single element.
In this case, the scalar “1.2345” is typed as a float (or numeric) by the parser. yaml.load would return a numeric vector of length 1 for this document.
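The scalar case can be checked with a short sketch like the following, which assumes the yaml package is installed and uses an illustrative variable name:

```r
library(yaml)

# Parse a document that consists of a single float scalar
x <- yaml.load("1.2345")

is.numeric(x)  # the scalar is typed as numeric
length(x)      # a vector of length 1
```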
A YAML sequence is a list of elements. In block style, each element is written on its own line after a dash and a space, such as - 1, - 2, and - 3 on three consecutive lines.
If you pass a YAML sequence to yaml.load, a couple of things can happen. If all of the elements in the sequence are uniform, yaml.load will return a vector of that type (i.e. character, integer, real, or logical). If the elements are not uniform, yaml.load will return a list of the elements.
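To illustrate the difference, here is a small sketch that parses one uniform and one mixed sequence. It assumes the yaml package is installed; the two documents are made up for the example:

```r
library(yaml)

# All elements are integers, so the result is an atomic vector
uniform <- yaml.load("- 1\n- 2\n- 3")

# Integer, string, and float together are not uniform, so the result is a list
mixed <- yaml.load("- 1\n- two\n- 3.5")

class(uniform)
class(mixed)
```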
A YAML map is a list of paired keys and values, also known as a hash. Each pair is written as a key, a colon, and a value, such as one: 1.
Passing a map to yaml.load will produce a named list by default. That is, keys are coerced to strings. Since it is possible for the keys of a YAML map to be almost anything (not just strings), you might not want yaml.load to return a named list. If you want to preserve the data type of keys, you can pass as.named.list = FALSE to yaml.load. If as.named.list is FALSE, yaml.load will create a keys attribute for the list it returns instead of coercing the keys into strings.
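The following sketch contrasts the two modes on a map with integer keys. It assumes the yaml package is installed, and the document and variable names are invented for illustration:

```r
library(yaml)

# A map whose keys are integers rather than strings
doc <- "
1: one
2: two
"

# Default: a named list, with the keys coerced to strings
named <- yaml.load(doc)
names(named)

# as.named.list = FALSE: no names, the original keys live in an attribute
unnamed <- yaml.load(doc, as.named.list = FALSE)
names(unnamed)
attr(unnamed, "keys")
```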
yaml.load has the capability to accept custom handler functions. With handlers, you can customize yaml.load to do almost anything you want. Example of handler usage:
integer.handler <- function(x) { as.integer(x) + 123 }
yaml.load("123", handlers = list(int = integer.handler)) #=> [1] 246
Handlers are passed to yaml.load through the handlers argument. The handlers argument must be a named list of functions, where each name is the YAML type that you want to be handled by your function. The functions you provide must accept one argument and must return an R object.
Handler functions will be passed a string or list, depending on the original type of the object. In the example above, integer.handler was passed the string “123”.
Custom sequence handlers will be passed a list of objects. You can then convert the list into whatever you want and return it. Example:
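A minimal sketch of a sequence handler, assuming the yaml package is installed. The handler name sequence.handler and the choice to add 3 to each element are hypothetical:

```r
library(yaml)

# Hypothetical handler: flatten the list it receives into an integer
# vector and add 3 to every element
sequence.handler <- function(x) {
  as.integer(unlist(x)) + 3L
}

doc <- "
- 1
- 2
- 3
"

# The name "seq" routes every YAML sequence through the handler
result <- yaml.load(doc, handlers = list(seq = sequence.handler))
print(result)
```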
Custom map handlers work in much the same way as custom sequence handlers. A map handler function is passed a named list, or a list with a keys attribute (depending on the value of as.named.list). Example:
string <- "
a:
- 1
- 2
b:
- 3
- 4
"
yaml.load(string, handlers = list(map = function(x) { as.data.frame(x) }))
Returns a data frame with one column per key: column a holds 1 and 2, and column b holds 3 and 4.
yaml.load_file does the same thing as yaml.load, except that it reads the document from a file or a connection. For example:
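A short sketch, assuming the yaml package is installed. The temporary file and its contents are made up so that the example is self-contained:

```r
library(yaml)

# Write a small YAML document to a temporary file
path <- tempfile(fileext = ".yml")
writeLines(c("name: example", "values:", "  - 1", "  - 2"), path)

# Parse the file by passing its name as the first argument
config <- yaml.load_file(path)
config$name
config$values

# Clean up the temporary file
unlink(path)
```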
This function takes the same arguments as yaml.load, with the exception that the first argument is a filename or a connection.
The read_yaml function is a convenience function that works similarly to functions in the readr package. You can use it instead of yaml.load_file if you prefer.
as.yaml is used to convert R objects into YAML strings. For example, a nested list is emitted as nested YAML maps and sequences.
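A minimal sketch of as.yaml on a nested list, assuming the yaml package is installed; the list contents are illustrative:

```r
library(yaml)

# A named list nests as YAML maps; the unnamed inner list becomes a sequence
x <- list(foo = list(bar = list("foo", "bar")))

out <- as.yaml(x)
cat(out)
# foo:
#   bar:
#   - foo
#   - bar
```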
You can control the number of spaces used to indent by setting the indent option. By default, indent is 2.
For example, with indent = 3, each nested level is indented by three spaces instead of two.
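The effect of the indent option can be sketched as follows, assuming the yaml package is installed and using an illustrative nested list:

```r
library(yaml)

x <- list(foo = list(bar = list("foo", "bar")))

# Indent nested levels by three spaces instead of the default two
out <- as.yaml(x, indent = 3)
cat(out)
# foo:
#    bar:
#    - foo
#    - bar
```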
By default, sequences that are within a mapping context are not indented.
For example, by default list(foo = 1:10) is emitted with each sequence item starting in the same column as the foo key.
If you want sequences to be indented in this context, set the indent.mapping.sequence option to TRUE.
For example, emitting list(foo = 1:10) with indent.mapping.sequence = TRUE outputs:
foo:
  - 1
  - 2
  - 3
  - 4
  - 5
  - 6
  - 7
  - 8
  - 9
  - 10
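Both behaviors can be compared side by side with a sketch like this one, which assumes the yaml package is installed and uses a shorter sequence to keep the output small:

```r
library(yaml)

x <- list(foo = 1:3)

# Default: sequence items start in the same column as the key
default <- as.yaml(x)
cat(default)

# With indent.mapping.sequence = TRUE the items are indented under the key
indented <- as.yaml(x, indent.mapping.sequence = TRUE)
cat(indented)
```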
The column.major option determines how a data frame is converted into YAML. By default, column.major is TRUE.
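When column.major is TRUE, as.yaml emits the data frame as a map from each column name to a sequence of that column's values. When column.major is FALSE, it emits a sequence of maps, one per row, each mapping the column names to that row's values.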
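A small sketch of both layouts, assuming the yaml package is installed. The data frame df is made up for the example, and the emitted strings are parsed back with yaml.load to show their shape:

```r
library(yaml)

df <- data.frame(a = 1:2, b = c("u", "v"))

# Column-major (the default): one map entry per column
by_column <- as.yaml(df, column.major = TRUE)
cat(by_column)

# Row-major: one sequence item per row
by_row <- as.yaml(df, column.major = FALSE)
cat(by_row)

# Parsing the results back shows the two structures
cols <- yaml.load(by_column)  # named list of column vectors
rows <- yaml.load(by_row)     # list with one named list per row
```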
You can specify custom handler functions via the handlers argument. This argument must be a named list of functions, where the names are R object class names (e.g., ‘numeric’, ‘data.frame’, ‘list’, etc.). The function(s) you provide will be passed one argument (the R object) and can return any R object. The returned object will be emitted normally.
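As a sketch of an emitter handler, the following formats a date-time value before it is emitted. It assumes the yaml package is installed; the fixed timestamp and the date format are arbitrary choices for illustration:

```r
library(yaml)

# A fixed point in time so that the result is reproducible
when <- as.POSIXct("2020-01-02 03:04:05", tz = "UTC")

# The handler is looked up by class name and receives the R object;
# whatever it returns is emitted in place of the original value
out <- as.yaml(when, handlers = list(
  POSIXct = function(x) format(x, "%Y-%m-%d")
))
cat(out)
```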
Character vectors that have a class of 'verbatim' will not be quoted in the output YAML document except when the YAML specification requires it. This means that you cannot do anything that would result in an invalid YAML document, but you can emit strings that would otherwise be quoted. This is useful for changing how logical vectors are emitted. For example:
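A sketch of this technique, assuming the yaml package is installed. The handler converts each logical value to the text true or false and marks the result as verbatim so that it is emitted unquoted:

```r
library(yaml)

out <- as.yaml(c(TRUE, FALSE), handlers = list(
  logical = function(x) {
    # Replace each logical with the desired literal text
    result <- ifelse(x, "true", "false")
    # Mark the strings as verbatim so they are not quoted
    class(result) <- "verbatim"
    result
  }
))
cat(out)
```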
There are times you might need to ensure a string scalar is quoted. Apply a non-null attribute of “quoted” to the string you need quoted and it will come out with double quotes around it.
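For instance, a value such as a port mapping can be forced into double quotes. This sketch assumes the yaml package is installed; the variable name port_def and its value are illustrative:

```r
library(yaml)

# Set a non-null "quoted" attribute on the string
port_def <- "80:80"
attr(port_def, "quoted") <- TRUE

out <- as.yaml(list(ports = list(port_def)))
cat(out)
```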
You can specify YAML tags for R objects by setting the 'tag' attribute to a character vector of length 1. If you set a tag for a vector, the tag will be applied to the YAML sequence as a whole, unless the vector has only 1 element. If you wish to tag individual elements, you must use a list of 1-length vectors, each with a tag attribute. Likewise, if you set a tag for an object that would be emitted as a YAML mapping (like a data frame or a named list), it will be applied to the mapping as a whole. Tags can be used in conjunction with YAML deserialization functions like yaml.load via custom handlers. However, if you set an internal tag on an incompatible data type (like !seq 1.0), errors will occur when you try to deserialize the document.
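The following sketch tags a scalar and a whole sequence. It assumes the yaml package is installed, and the tag name !thing is an arbitrary example:

```r
library(yaml)

# Tag a single string scalar
scalar <- "thing"
attr(scalar, "tag") <- "!thing"
scalar_out <- as.yaml(scalar)
cat(scalar_out)

# Tag a vector: the tag applies to the sequence as a whole
vec <- 1:3
attr(vec, "tag") <- "!thing"
seq_out <- as.yaml(vec)
cat(seq_out)
```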
The write_yaml function is a convenience function that works similarly to functions in the readr package. It calls as.yaml and writes the result to a file or a connection.
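A round trip through a file might look like the sketch below. It assumes the yaml package is installed; the settings list and the temporary path are invented for the example:

```r
library(yaml)

settings <- list(name = "demo", ids = 1:3)

# Write the object to a temporary YAML file
path <- tempfile(fileext = ".yml")
write_yaml(settings, path)

# Read it back with the matching convenience function
back <- read_yaml(path)
back$name
back$ids

# Clean up the temporary file
unlink(path)
```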
For more information, run help(package='yaml'), or run example('yaml-package') for some examples.
There is a Makefile for use with GNU Make to help with development. There are several make targets for building, debugging, and testing. You can run these by executing make <target-name> if you have the make program installed.
Target name | Description |
---|---|
compile | Compile the source files |
check | Run CRAN checks |
gct-check | Run CRAN checks with gctorture |
test | Run unit tests |
gdb-test | Run unit tests with gdb |
valgrind-test | Run unit tests with valgrind |
tarball | Create tarball suitable for CRAN submission |
all | Default target, runs compile and test |
If you’d like to set up a local development and testing environment using Docker, you can follow these instructions:
git clone git@github.com:vubiostat/r-yaml.git
cd r-yaml
docker run -it --name r-yaml --workdir /opt -v$(pwd):/opt r-base:4.2.3 bash
apt-get update
apt-get install -y texlive-latex-base texlive-fonts-extra texlive-latex-recommended texlive-fonts-recommended
Rscript -e 'install.packages("RUnit")'
make check
make test
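exit  # leave the container; it is stopped but not removed
docker container start -i r-yaml  # re-enter the same container later
docker rm r-yaml  # remove the container when you no longer need it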
The algorithm used whenever there is no YAML tag explicitly provided is located in the implicit.re file. This file is used to create the implicit.c file via the re2c program. If you want to change this algorithm, make your changes in implicit.re, not implicit.c. The make targets will automatically update the C file as needed, but you’ll need to have the re2c program installed for it to work.