Parse input into code chunks, inline code expressions, and text fragments:
crack() is for parsing R Markdown, and sieve() is for R scripts.
crack(input, text = NULL)
sieve(input, text = NULL)
input: A character vector to provide the input file path or text. If
not provided, the text argument must be provided instead. The input
vector is treated as a file path if it is a single string that either
points to an existing file or has a filename extension. In all other cases,
the vector is treated as text input, as if it had been passed to the text
argument. To avoid ambiguity, if a string should be treated as text input
even though it happens to be an existing file path or has an extension,
wrap it in I(), or simply use the text argument instead.
text: A character vector as the text input. By default, it is read from
the input file if provided.
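The rule for deciding whether input is a file path or text can be sketched in base R. The helper name looks_like_path() is hypothetical and only mirrors the description above; it is not litedown's actual implementation, which may differ in details.

```r
# Hypothetical helper that mirrors the documented rule (not litedown's own code)
looks_like_path = function(input) {
  # I() adds the class "AsIs", which marks the value as literal text
  if (inherits(input, "AsIs")) return(FALSE)
  # only a single string can be a file path
  if (!is.character(input) || length(input) != 1) return(FALSE)
  # an existing file, or any string with a filename extension
  file.exists(input) || tools::file_ext(input) != ""
}

looks_like_path("report.Rmd")           # has an extension, so treated as a path
looks_like_path(I("report.Rmd"))        # protected by I(), so treated as text
looks_like_path(c("# Title", "Hello"))  # more than one element, so treated as text
```

In practice, passing the content through the text argument avoids the question entirely.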
A list of code chunks and text blocks:
Code chunks are of the form list(source, type = "code_chunk", options, comments, ...): source is a character vector of the source code of a
code chunk, options is a list of chunk options, and comments is a
vector of pipe comments.
Text blocks are of the form list(source, type = "text_block", ...). If
the text block does not contain any inline code, source will be a
character string (lines of text concatenated by line breaks), otherwise it
will be a list with members that are either character strings (normal text
fragments) or lists of the form list(source, options, ...) (source is
the inline code, and options contains its options specified inside `{lang, ...}`).
Both code chunks and text blocks have a list member named lines that
stores their starting and ending line numbers in the input.
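As a minimal sketch of how the returned list might be consumed, the elements can be split by their type member and located through their lines member. The sample document below is made up for illustration, and no particular output is claimed.

```r
library(litedown)

# a small made-up R Markdown document, passed as text
res = crack(text = c("```{r}", "1 + 1", "```", "", "The value of pi is `{r} pi`."))

# split the parsed elements by their type
types = vapply(res, function(x) x$type, character(1))
chunks = res[types == "code_chunk"]
blocks = res[types == "text_block"]

# report where each element sits in the input
for (x in res) {
  cat(sprintf("%s: lines %d-%d\n", x$type, x$lines[1], x$lines[2]))
}
```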
For R Markdown, a code chunk must start with a fence of the form ```{lang}, where lang is the language name, e.g., r or python. The
body of a code chunk can start with chunk options written in "pipe comments",
e.g., #| eval = TRUE, echo = FALSE (the CSV syntax) or #| eval: true (the
YAML syntax). An inline code fragment is of the form `{lang} source`
embedded in Markdown text.
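The two pipe comment syntaxes can be compared side by side. This is a sketch that assumes both chunks are parsed into equivalent options lists; inspect the results with str() rather than relying on a specific printed form.

```r
library(litedown)

# the same options written in the CSV syntax and in the YAML syntax
csv = crack(text = c("```{r}", "#| eval = TRUE, echo = FALSE", "1 + 1", "```"))
yaml = crack(text = c("```{r}", "#| eval: true", "#| echo: false", "1 + 1", "```"))

str(csv[[1]]$options)   # parsed chunk options
str(yaml[[1]]$options)
csv[[1]]$comments       # the raw pipe comment lines
```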
For R scripts, text blocks are extracted by removing the leading
#' tokens. All other lines are treated as R code, which can optionally be
separated into chunks by consecutive lines of #| comments (chunk options
are written in these comments). If no #' or #| tokens are found in the
script, the script will be divided into chunks that contain the smallest
possible complete R expressions.
For simplicity, sieve() does not support inline code expressions.
Text after #' is treated as pure Markdown.
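To illustrate the fallback for scripts without any #' or #| lines, the sketch below passes a small made-up script to sieve(). Based on the description above, each returned element should hold one complete expression, with the multi-line function definition kept together.

```r
library(litedown)

# a made-up script without any #' or #| lines
res = sieve(text = c("x = 1", "f = function(y) {", "  y + x", "}", "f(2)"))

# list the source of each chunk
lapply(res, function(x) x$source)
```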
It is a pure coincidence that the function names crack() and sieve()
weakly resemble Carson Sievert's name, but I will consider adding a class
name sievert to the returned value of sieve() if Carson becomes the
president of the United States someday, which may make the value
radioactive and introduce a new programming paradigm named Radioactive
Programming (in case Reactive Programming is no longer fun or cool).
library(litedown)
# parse R Markdown
res = crack(c("```{r}\n1+1\n```", "Hello, `pi` = `{r} pi` and `e` = `{r} exp(1)`!"))
str(res)
#> List of 2
#> $ :List of 6
#> ..$ source : chr "1+1"
#> ..$ type : chr "code_chunk"
#> ..$ lines : int [1:2] 1 3
#> ..$ options :List of 2
#> .. ..$ label : chr "chunk-1"
#> .. ..$ engine: chr "r"
#> ..$ comments : NULL
#> ..$ code_start: int 2
#> $ :List of 3
#> ..$ source:List of 5
#> .. ..$ : chr "Hello, `pi` = "
#> .. ..$ :List of 3
#> .. .. ..$ source : chr "pi"
#> .. .. ..$ pos : int [1:4] 4 16 4 21
#> .. .. ..$ options:List of 1
#> .. .. .. ..$ engine: chr "r"
#> .. ..$ : chr " and `e` = "
#> .. ..$ :List of 3
#> .. .. ..$ source : chr "exp(1)"
#> .. .. ..$ pos : int [1:4] 4 35 4 44
#> .. .. ..$ options:List of 1
#> .. .. .. ..$ engine: chr "r"
#> .. ..$ : chr "!"
#> ..$ type : chr "text_block"
#> ..$ lines : int [1:2] 4 4
# evaluate inline code and combine results with text fragments
txt = lapply(res[[2]]$source, function(x) {
  if (is.character(x)) x else eval(parse(text = x$source))
})
paste(unlist(txt), collapse = "")
#> [1] "Hello, `pi` = 3.14159265358979 and `e` = 2.71828182845905!"
# parse R code
res = sieve(c("#' This is _doc_.", "", "#| eval=TRUE", "# this is code", "1 + 1"))
str(res)
#> List of 2
#> $ :List of 3
#> ..$ source: chr "This is _doc_.\n"
#> ..$ type : chr "text_block"
#> ..$ lines : int [1:2] 1 2
#> $ :List of 6
#> ..$ source : chr [1:2] "# this is code" "1 + 1"
#> ..$ options :List of 3
#> .. ..$ eval : logi TRUE
#> .. ..$ engine: chr "r"
#> .. ..$ label : chr "chunk-1"
#> ..$ comments : chr "#| eval=TRUE"
#> ..$ code_start: int 4
#> ..$ type : chr "code_chunk"
#> ..$ lines : int [1:2] 3 5