pandoc is currently an experimental R package primarily developed to help maintainers of the R Markdown ecosystem.
Indeed, the R Markdown ecosystem is highly dependent on changes in Pandoc (https://pandoc.org/), yet it is designed to be as version independent as possible. R Markdown is best used with the latest Pandoc version, but any rmarkdown package version should work with previous versions of Pandoc, and new changes in Pandoc should not break any rmarkdown features.
This explains the need for more focused tooling to:
- Install and manage several Pandoc versions. This is useful for testing versions and comparing between them.
- Call Pandoc’s command directly without the layers added by rmarkdown. This is useful for debugging or quickly iterating and finding where a bug comes from.
- Retrieve information from Pandoc directly. Each version comes with changes, and some of them are included in the binary. Being able to retrieve this information and compare it between versions is important to help maintain the user-facing tooling.
This package can also be useful to advanced developers who work with Pandoc, whether through rmarkdown or not.
Installation
Install from CRAN:
install.packages("pandoc")
The development version can be installed from GitHub with:
# install.packages("pak")
pak::pak("cderv/pandoc")
Install pandoc
The main use is to install the latest released version of
Pandoc. This requires the gh package, as it
fetches release information from GitHub and downloads the bundle from
there.
pandoc_install()
#> i Fetching Pandoc releases info from github...
#> v Pandoc 3.8.2.1 already installed.
#> Use 'force = TRUE' to overwrite.
If a specific older Pandoc version is needed (e.g. for testing differences between versions), a version can be specified.
pandoc_install("2.11.4")
#> v Pandoc 2.11.4 already installed.
#> Use 'force = TRUE' to overwrite.
Information fetched from GitHub is cached for the duration of the session.
Sometimes the development version of Pandoc is required. The Pandoc team builds a binary every day in their CI, called nightly.
# install the nightly version (overwrites previous one)
pandoc_install_nightly() # or pandoc_install("nightly")
All those versions can live together and are installed in an isolated directory in the user's data folder.
# Which versions are currently installed?
pandoc_installed_versions()
# Which is the latest version installed (nightly excluded)?
pandoc_installed_latest()
# Is a specific version installed?
pandoc_is_installed("2.11.4")
pandoc_is_installed("2.7.3")
Downloaded bundles are also cached to speed up further installations. This is useful in tests to quickly install and uninstall a Pandoc version for a specific test.
pandoc_install("2.7.3")
pandoc_uninstall("2.7.3")
pandoc_install("2.7.3")
To quickly install the latest available release, run
pandoc_update() (an alias of
pandoc_install(), which already defaults to the latest
version).
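The install and uninstall cycle above can be wrapped in a small helper for tests. The following is a minimal sketch, not part of the package: with_temporary_pandoc() is a hypothetical name, and it relies only on pandoc_is_installed(), pandoc_install() and pandoc_uninstall() as shown in this document. Running it downloads a bundle unless one is already cached.

```r
# Hypothetical helper (not provided by the pandoc package): make sure `version`
# is installed while `code` runs, and remove it afterwards only if this call
# installed it.
with_temporary_pandoc <- function(version, code) {
  if (!pandoc::pandoc_is_installed(version)) {
    pandoc::pandoc_install(version)
    # Clean up on exit, even if `code` throws an error
    on.exit(pandoc::pandoc_uninstall(version), add = TRUE)
  }
  # `code` is evaluated lazily, so it only runs once the version is in place
  force(code)
}

# Usage (commented out because it needs the pandoc package and network access):
# with_temporary_pandoc("2.7.3", pandoc::pandoc_version("2.7.3"))
```

A version that was already installed before the call is left untouched, so the helper should be safe to use alongside a developer's existing setup.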
Find where a pandoc binary is located
For any version installed with this package,
pandoc_locate() will return the folder where it was
installed.
pandoc_locate("2.11.4")
pandoc_locate("nightly")
For example purposes in this vignette, the path above is in a temp
directory. The actual location is in the user's data directory, computed with
rappdirs::user_data_dir() (e.g. on Windows,
C:/Users/chris/AppData/Local/r-pandoc/r-pandoc).
To get the path to a pandoc binary, pandoc_bin() can be
used:
pandoc_bin("2.11.4")
pandoc_bin("nightly")
This function also supports external Pandoc versions, such as:
- the one shipped with RStudio IDE (version = "rstudio"): pandoc::pandoc_bin("rstudio") or its alias pandoc::pandoc_rstudio_bin()
- the one set by default on the system (in PATH) (version = "system"): pandoc::pandoc_bin("system") or its alias pandoc::pandoc_system_bin()
Activate a Pandoc version
As multiple versions can be installed, a default active Pandoc
version is used by all functions
(version = "default").
A specific version can be made active using
pandoc_activate():
# Default to latest version installed
pandoc_activate()
pandoc_locate()
pandoc_bin()
# Activate specific version
pandoc_activate("2.11.4")
pandoc_locate()
pandoc_bin()
# including nightly
pandoc_activate("nightly")
pandoc_locate()
pandoc_bin()
# Activate system version
pandoc_activate("system")
pandoc_bin()
A default active version is set when the package is loaded (i.e.
using onLoad), following this search order:
- Latest version installed by this package (i.e. pandoc_installed_latest())
- Version shipped with RStudio IDE (found when run inside RStudio IDE)
- pandoc binary found in system PATH (i.e. Sys.which("pandoc"))
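The search order above amounts to taking the first candidate that resolves to something usable. Here is a minimal base R sketch of that idea, under stated assumptions: first_available() and the candidate list are hypothetical, and this is not the package's actual onLoad code. The RSTUDIO_PANDOC environment variable is used only as an illustrative way to detect the RStudio IDE binary.

```r
# Hypothetical helper, not part of the pandoc package: return the name of the
# first candidate whose path is a single non-empty, non-missing string.
first_available <- function(candidates) {
  usable <- function(path) {
    !is.null(path) && length(path) == 1 && !is.na(path) && nzchar(path)
  }
  found <- Filter(usable, candidates)
  if (length(found) == 0) {
    return(NULL)
  }
  names(found)[1]
}

# Candidates listed in the same order as the search order described above.
candidates <- list(
  installed = "",                            # assumption: nothing installed by the package
  rstudio   = Sys.getenv("RSTUDIO_PANDOC"),  # assumption: set when running inside RStudio IDE
  system    = unname(Sys.which("pandoc"))    # empty string when pandoc is not on PATH
)
first_available(candidates)
```

The result depends on the machine: it is NULL when no candidate resolves, which corresponds to no Pandoc version being active.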
pandoc_is_active() makes it easy to check whether a specific
version is active.
pandoc_is_active("system")
pandoc_is_active("2.7.3")
pandoc_activate("2.7.3")
pandoc_is_active("2.7.3")
Working with rmarkdown functions related to Pandoc
By default, if rmarkdown is installed,
pandoc_activate() will also set the version active for all
rmarkdown functions (using
rmarkdown::find_pandoc()). This makes it easy to use this package
to test rmarkdown with different
versions of Pandoc.
pandoc_activate("2.7.3")
rmarkdown::pandoc_available()
rmarkdown::pandoc_version()
rmarkdown::find_pandoc()
These calls are equivalent:
pandoc_activate("2.7.3", rmarkdown = TRUE)
rmarkdown::find_pandoc(cache = FALSE, dir = pandoc::pandoc_locate("2.7.3"))
If setting the default Pandoc version for rmarkdown
is not desired, just run with rmarkdown = FALSE:
pandoc::pandoc_activate("2.11.4", rmarkdown = FALSE)
rmarkdown::pandoc_version()
During testing, it is also useful to run specific code with a
specific version. with_pandoc_version() and
local_pandoc_version() allow this by running
pandoc_activate() for the expression only (helpers in the style of withr).
# with pandoc package functions
with_pandoc_version("2.11.4", {
  pandoc::pandoc_version()
})
# with rmarkdown package functions
rmarkdown::pandoc_version()
# It will also activate version for rmarkdown
with_pandoc_version("2.11.4", {
  rmarkdown::pandoc_version()
})
# rmarkdown = FALSE can be set if not desired
with_pandoc_version("2.11.4", rmarkdown = FALSE, {
  rmarkdown::pandoc_version()
})
The default behavior of local_pandoc_version() and
with_pandoc_version() is determined by the option
pandoc.activate_rmarkdown.
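Since local_pandoc_version() is described as a withr-like helper, a sketch of its use inside a function could look like the following. html_with_version() is a hypothetical helper. It assumes that the activation is undone when the calling function exits (the usual withr local_*() convention) and that the requested version is already installed.

```r
# Hypothetical helper (not part of the pandoc package): convert a piece of
# Markdown text to HTML with a given Pandoc version.
html_with_version <- function(text, version) {
  # Assumption: activation lasts for this function call only
  pandoc::local_pandoc_version(version)
  pandoc::pandoc_convert(text = text, to = "html")
}

# Usage (commented out because it needs the pandoc package and both versions
# installed):
# html_with_version("# A header", "2.11.4")
# html_with_version("# A header", "2.7.3")
```

Keeping the activation local to the function avoids leaking a test-specific version into later code.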
Check if a pandoc version is available
Is a Pandoc version available to use (i.e. is a version active), and if so, what is its full path?
if (pandoc_available()) pandoc_bin()
Does the active Pandoc meet some version requirements?
# Is the active version 2.10.1 or above?
pandoc_available(min = "2.10.1")
# Is the active version 2.11 or below?
pandoc_available(max = "2.11")
# Is the active version between 2.10.1 and 2.11, both sides included?
pandoc_available(min = "2.10.1", max = "2.11")
The Pandoc version can also easily be retrieved, including for external binaries:
# Get version from current active one
pandoc_version()
# Get version of the nightly build
pandoc_version("nightly")
# Get version of the system binary
pandoc_version("system") # equivalent to pandoc_system_version()
Run Pandoc CLI from R
Low level call to Pandoc
pandoc_run() is the function that calls the pandoc binary with
some arguments. By default, it uses the active version
(version = "default", see
?pandoc_activate):
pandoc_run("--version")
This is equivalent to calling pandoc --version on the command line
with the correct binary.
The version= argument allows running a specific
version:
pandoc_run("--version", version = "system")
This executes the pandoc command with the pandoc binary found on PATH.
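Conceptually, running with version = "system" is close to invoking the binary found on PATH yourself. Here is a base R sketch of that idea; run_cli() is a hypothetical helper, and this is an assumption about the general mechanism, not a description of how pandoc_run() is implemented.

```r
# Hypothetical helper (base R only): run a command-line binary with some
# arguments and return its standard output as a character vector, or NULL
# when the binary cannot be found.
run_cli <- function(bin, args) {
  bin <- unname(bin)
  if (length(bin) != 1 || is.na(bin) || !nzchar(bin) || !file.exists(bin)) {
    return(NULL)
  }
  system2(bin, args, stdout = TRUE)
}

# Sys.which() returns an empty string when pandoc is not on PATH
out <- run_cli(Sys.which("pandoc"), "--version")
if (!is.null(out)) print(out[1])
```

The value of pandoc_run() over such a direct call is that it resolves the binary for any managed version, not only the one on PATH.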
Convert a document
This function is highly experimental, and the probability of API changes is high.
The main use of Pandoc is to convert a document.
pandoc::pandoc_convert() is currently a thinner wrapper
than rmarkdown::pandoc_convert(). Both can convert a
file, but the former can also convert from text and not just from a
file.
# convert from text directly
pandoc_convert(text = "# A header", to = "html")
pandoc_convert(text = "# A header", to = "html", version = "system")
# convert from file
tmp <- tempfile(fileext = ".md")
writeLines("**bold** word!", tmp)
pandoc_convert(tmp, to = "html")
# write to file
out <- tempfile(fileext = ".html")
outfile <- pandoc_convert(tmp, to = "html", output = out, standalone = TRUE, version = "system")
readLines(outfile, n = 5)
Various wrapper functions around pandoc CLI
All other included functions that run pandoc wrap
pandoc_run() with some command flags from the Pandoc MANUAL. Each of these
functions can take the version= argument to run with a
specific version of Pandoc instead of the currently active one.
Some of those functions can only be used with specific pandoc versions and an error will be thrown if the version requirement is not met.
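Because a wrapper may throw an error when the selected Pandoc version does not meet its requirement, scripts that loop over several versions may want to catch that error and carry on. Here is a minimal sketch using base R's tryCatch(); or_null() is a hypothetical helper, not part of the package.

```r
# Hypothetical helper (base R only): evaluate `expr` and return its value, or
# report the error message and return NULL if evaluation fails.
or_null <- function(expr) {
  tryCatch(
    expr,
    error = function(e) {
      message("Skipped: ", conditionMessage(e))
      NULL
    }
  )
}

# Usage (commented out because it needs the pandoc package and the listed
# versions installed):
# for (v in c("2.7.3", "2.11.4")) {
#   print(or_null(pandoc::pandoc_list_extensions(format = "gfm", version = v)))
# }
```

Returning NULL keeps the loop going, at the cost of hiding the failure from calling code, so the message is the only trace of a skipped version.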
List supported extensions for a format
pandoc_list_extensions()
pandoc_list_extensions(format = "gfm")
pandoc_list_extensions(format = "html", version = "nightly")
List available input or output formats
pandoc_list_formats("input")
pandoc_list_formats("output")
pandoc_list_formats("output", version = "nightly")
Export a data file
outfile <- pandoc_export_data_file(file = "styles.html")
outfile
readLines(outfile, n = 5)
Export a highlight style JSON file
outfile <- pandoc_export_highlight_theme(style = "zenburn")
outfile
readLines(outfile, n = 5)
Export a DOCX or PPTX reference doc
ref_docx <- pandoc_export_reference_doc(type = "docx")
ref_docx
ref_pptx <- pandoc_export_reference_doc(type = "pptx")
ref_pptx
Export a template for a format
pandoc_export_template(format = "jira")
outfile <- pandoc_export_template(format = "latex", output = "default.latex")
outfile
readLines(outfile, n = 5)
Helpers to easily browse Pandoc’s online resources
pandoc_browse_*() helpers are included to quickly open
an online document such as the Pandoc MANUAL
(pandoc_browse_manual()) or the documentation for an
extension (pandoc_browse_extension("smart")). See the reference
documentation for more.