pandoc is currently an experimental R package, primarily developed to help maintainers of the R Markdown ecosystem.

Indeed, the R Markdown ecosystem depends heavily on Pandoc (https://pandoc.org/) and its changes, while being designed to be as version-independent as possible. R Markdown is best used with the latest Pandoc version, but any rmarkdown package version should work with previous versions of Pandoc, and new changes in Pandoc should not break any rmarkdown features.

This explains the need for more focused tooling to:

  • Install and manage several Pandoc versions. This is useful for testing versions and comparing between them.
  • Call Pandoc’s command directly without the layers added by rmarkdown. This is useful for debugging or quickly iterating and finding where a bug comes from.
  • Retrieve information from Pandoc directly. Each version comes with changes, and some of them are built into the binary. Being able to retrieve that information and compare it between versions is important to help maintain the user-facing tooling.

This package can also be useful to advanced developers who work with Pandoc, whether through rmarkdown or not.

Installation

Install from CRAN:

install.packages("pandoc")

The development version can be installed from GitHub with:

# install.packages("pak")
pak::pak("cderv/pandoc")

Install pandoc

The main usage is to install the latest released version of Pandoc. This requires the gh package, as it fetches release information from GitHub and downloads the bundle from there.

pandoc_install()
#> i Fetching Pandoc releases info from github...
#> v Pandoc 3.8.2.1 already installed.
#>   Use 'force = TRUE' to overwrite.

If a specific older Pandoc version is needed (e.g. for testing differences between versions), a version can be specified.

pandoc_install("2.11.4")
#> v Pandoc 2.11.4 already installed.
#>   Use 'force = TRUE' to overwrite.

Information fetched from GitHub is cached for the duration of the session.

Sometimes, the development version of Pandoc is required. The Pandoc team builds a binary every day, called nightly, in their CI.

# install the nightly version (overwrites previous one)
pandoc_install_nightly() # or pandoc_install("nightly")

All these versions can live side by side and are installed in an isolated directory in the user's data folder.

# Which versions are currently installed?
pandoc_installed_versions()
# Which is the latest version installed (nightly excluded)?
pandoc_installed_latest()
# Is a specific version installed?
pandoc_is_installed("2.11.4")
pandoc_is_installed("2.7.3")

Downloaded bundles are also cached to speed up later installations. This is useful in tests to quickly install and uninstall a Pandoc version for a specific test.

To quickly install the latest available release, run pandoc_update() (an alias of pandoc_install(), which already defaults to the latest version).

Find where a pandoc binary is located

For any version installed with this package, pandoc_locate() will return the folder where it was installed.

pandoc_locate("2.11.4")
pandoc_locate("nightly")

For the purposes of this example, the paths returned above are in a temporary directory. The actual location is in the user's data directory, computed with rappdirs::user_data_dir() (e.g. on Windows, C:/Users/chris/AppData/Local/r-pandoc/r-pandoc).

To get the path to a pandoc binary, pandoc_bin() can be used:

pandoc_bin("2.11.4")
pandoc_bin("nightly")

This function also supports external Pandoc versions, such as the binary found on the system PATH (version = "system").

Activate a Pandoc version

Since multiple versions can be installed, a default active Pandoc version is used by all functions (version = "default").

A specific version can be made active using pandoc_activate():

# Default to latest version installed
pandoc_activate()
pandoc_locate()
pandoc_bin()

# Activate specific version
pandoc_activate("2.11.4")
pandoc_locate()
pandoc_bin()

# including nightly
pandoc_activate("nightly")
pandoc_locate()
pandoc_bin()

# Activate system version
pandoc_activate("system")
pandoc_bin()

A default active version is set when the package is loaded (i.e. in .onLoad()), following this search order:

  • Latest version installed by this package (i.e. pandoc_installed_latest())
  • Version shipped with RStudio IDE (found when run inside RStudio IDE)
  • pandoc binary found in system PATH (i.e Sys.which("pandoc"))
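
The search order above can be pictured with a small base R sketch. This is not the package's implementation: pick_default_pandoc() is a hypothetical helper, its installed_latest argument stands in for whatever the package itself has installed, and reading the RSTUDIO_PANDOC environment variable to detect the Pandoc shipped with RStudio is an assumption about how that step could work.

```r
# Hypothetical helper (not part of the pandoc package): return the first
# non-empty candidate, in the priority order described above.
pick_default_pandoc <- function(installed_latest = "") {
  candidates <- c(
    installed = installed_latest,                 # version managed by the package
    rstudio   = Sys.getenv("RSTUDIO_PANDOC"),     # assumed to be set inside RStudio
    system    = unname(Sys.which("pandoc"))       # binary found on PATH, "" if none
  )
  found <- candidates[!is.na(candidates) & nzchar(candidates)]
  if (length(found) == 0) NULL else found[1]
}

# An installed version always wins over RStudio's and the system one.
pick_default_pandoc("/path/to/installed/pandoc")
```

The name of the returned element says which source was picked; NULL means no Pandoc was found at all.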

pandoc_is_active() makes it easy to check whether a specific version is active.

By default, if rmarkdown is installed, pandoc_activate() will also set the version active for all rmarkdown functions (using rmarkdown::find_pandoc()). This makes it easy to use this package to test rmarkdown with different versions of Pandoc.
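
As a minimal sketch, assuming Pandoc 2.11.4 was previously installed with pandoc_install("2.11.4"), the check could look like this. The guard and the active variable are additions for illustration, so that the snippet does nothing when the package or that version is missing.

```r
# NA means "could not check" (package or version not installed).
active <- NA
if (requireNamespace("pandoc", quietly = TRUE) &&
    pandoc::pandoc_is_installed("2.11.4")) {
  # Activate for the pandoc package only, leaving rmarkdown untouched.
  pandoc::pandoc_activate("2.11.4", rmarkdown = FALSE)
  active <- pandoc::pandoc_is_active("2.11.4")
}
active
```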

pandoc_activate("2.7.3")
rmarkdown::pandoc_available()
rmarkdown::pandoc_version()
rmarkdown::find_pandoc()

These calls are equivalent:

pandoc_activate("2.7.3", rmarkdown = TRUE)
rmarkdown::find_pandoc(cache = FALSE, dir = pandoc::pandoc_locate("2.7.3"))

If setting the default Pandoc version for rmarkdown is not desired, run with rmarkdown = FALSE:

pandoc::pandoc_activate("2.11.4", rmarkdown = FALSE)
rmarkdown::pandoc_version()

During testing, it is also useful to run specific code with a specific version. with_pandoc_version() and local_pandoc_version() do this by running pandoc_activate() only for the duration of the expression or the calling scope (helpers in the style of withr).

# with pandoc package functions
with_pandoc_version("2.11.4", {
  pandoc::pandoc_version()
})

# with rmarkdown package functions
rmarkdown::pandoc_version()

# It will also activate version for rmarkdown
with_pandoc_version("2.11.4", {
  rmarkdown::pandoc_version()
})

# rmarkdown = FALSE can be set if not desired
with_pandoc_version("2.11.4", rmarkdown = FALSE, {
  rmarkdown::pandoc_version()
})

The default behavior of local_pandoc_version() and with_pandoc_version() is determined by the pandoc.activate_rmarkdown option.
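
The examples above only show with_pandoc_version(). A sketch of the local_ variant, assuming it follows the usual withr convention of staying in effect until the calling function exits, could look like this. version_in_scope() is a hypothetical helper, and the example assumes 2.11.4 is already installed.

```r
# Hypothetical helper: report the Pandoc version seen inside a function
# while `version` is temporarily active.
version_in_scope <- function(version) {
  # Assumed to be undone automatically when this function returns.
  pandoc::local_pandoc_version(version)
  pandoc::pandoc_version()
}

# Usage, once pandoc_install("2.11.4") has been run:
# version_in_scope("2.11.4")
```

Setting options(pandoc.activate_rmarkdown = FALSE) beforehand should keep rmarkdown's own Pandoc untouched for both helpers.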

Check if a pandoc version is available

Is a Pandoc version available to use (i.e. a version is active), and if so, what is its full path?

Does the active Pandoc version meet some requirements?
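
A minimal sketch of that check, using only pandoc_available() and pandoc_bin() as described in this document. The requireNamespace() guard and the bin variable are additions for illustration.

```r
# NULL means no usable Pandoc (or the pandoc package itself is missing).
bin <- NULL
if (requireNamespace("pandoc", quietly = TRUE) && pandoc::pandoc_available()) {
  # Full path to the binary of the currently active version.
  bin <- pandoc::pandoc_bin()
}
bin
```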

# Is the active version above 2.10.1 ?
pandoc_available(min = "2.10.1")
# Is the active version below 2.11 ?
pandoc_available(max = "2.11")
# Is the active version between 2.10.1 and 2.11, both bounds included?
pandoc_available(min = "2.10.1", max = "2.11")

The Pandoc version can also easily be retrieved, including for external binaries:
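
To make the inclusive bounds concrete, here is a base R sketch of such a range check. version_in_range() is a hypothetical helper, not the package's code, and it assumes versions compare the way numeric_version() compares them.

```r
# Hypothetical helper: TRUE when `version` lies in [min, max],
# with both bounds included. A NULL bound means "no limit".
version_in_range <- function(version, min = NULL, max = NULL) {
  v <- numeric_version(version)
  above_min <- is.null(min) || v >= numeric_version(min)
  below_max <- is.null(max) || v <= numeric_version(max)
  above_min && below_max
}

version_in_range("2.11", min = "2.10.1", max = "2.11")  # TRUE: upper bound itself
version_in_range("2.11.4", max = "2.11")                # FALSE: 2.11.4 is past 2.11
```

Under this comparison, a patch release such as 2.11.4 does not satisfy max = "2.11".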

# Get version from current active one
pandoc_version()
# Get version of the nightly build
pandoc_version("nightly")
# Get version of the system binary
pandoc_version("system") # equivalent to pandoc_system_version()

Run Pandoc CLI from R

Low level call to Pandoc

pandoc_run() is the function that calls the pandoc binary with some arguments. By default, it uses the active version (version = "default", see ?pandoc_activate).

pandoc_run("--version")

is equivalent to calling

pandoc --version

with the correct binary.

Using the version= argument allows running a specific version:

pandoc_run("--version", version = "system")

will execute the pandoc command with the pandoc binary found on PATH.

Convert a document

This function is highly experimental and the probability of API changes is high.

The main usage of Pandoc is to convert a document. pandoc::pandoc_convert() is currently a thinner wrapper than rmarkdown::pandoc_convert(). Both can convert a file, but the former can also convert from text and not just from a file.

# convert from text directly
pandoc_convert(text = "# A header", to = "html")
pandoc_convert(text = "# A header", to = "html", version = "system")

# convert from file
tmp <- tempfile(fileext = ".md")
writeLines("**bold** word!", tmp)
pandoc_convert(tmp, to = "html")
# write to file
out <- tempfile(fileext = ".html")
outfile <- pandoc_convert(tmp, to = "html", output = out, standalone = TRUE, version = "system")
readLines(outfile, n = 5)

Various wrapper functions around the pandoc CLI

All other included functions that run pandoc wrap pandoc_run() with some command flags from the Pandoc MANUAL. Each of these functions accepts the version= argument to run with a specific version of Pandoc instead of the currently active one.

Some of those functions can only be used with specific pandoc versions and an error will be thrown if the version requirement is not met.

List supported extensions for a format

pandoc_list_extensions()
pandoc_list_extensions(format = "gfm")
pandoc_list_extensions(format = "html", version = "nightly")

List available input or output formats

pandoc_list_formats("input")
pandoc_list_formats("output")
pandoc_list_formats("output", version = "nightly")

List available highlight styles
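
A minimal sketch that goes through pandoc_run() with Pandoc's own --list-highlight-styles flag rather than a dedicated wrapper. The guard and the styles variable are additions for illustration.

```r
# NULL means no usable Pandoc (or the pandoc package itself is missing).
styles <- NULL
if (requireNamespace("pandoc", quietly = TRUE) && pandoc::pandoc_available()) {
  # Same as running `pandoc --list-highlight-styles` with the active binary.
  styles <- pandoc::pandoc_run("--list-highlight-styles")
}
styles
```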

List supported highlight languages
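
The same approach, sketched with Pandoc's --list-highlight-languages flag passed to pandoc_run(). The guard and the languages variable are additions for illustration.

```r
# NULL means no usable Pandoc (or the pandoc package itself is missing).
languages <- NULL
if (requireNamespace("pandoc", quietly = TRUE) && pandoc::pandoc_available()) {
  # Same as running `pandoc --list-highlight-languages` with the active binary.
  languages <- pandoc::pandoc_run("--list-highlight-languages")
}
languages
```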

Export a data file

outfile <- pandoc_export_data_file(file = "styles.html")
outfile
readLines(outfile, n = 5)

Export a highlight style JSON file

outfile <- pandoc_export_highlight_theme(style = "zenburn")
outfile
readLines(outfile, n = 5)

Export a DOCX or PPTX reference doc

ref_docx <- pandoc_export_reference_doc(type = "docx")
ref_docx
ref_pptx <- pandoc_export_reference_doc(type = "pptx")
ref_pptx

Export a template for a format

pandoc_export_template(format = "jira")
outfile <- pandoc_export_template(format = "latex", output = "default.latex")
outfile
readLines(outfile, n = 5)

Helpers to easily browse Pandoc’s online resources

pandoc_browse_*() helpers are included to quickly open an online document, such as the Pandoc MANUAL (pandoc_browse_manual()) or the documentation for an extension (pandoc_browse_extension("smart")). See the reference documentation for more.