High-quality conversion of PDF page(s) to png, jpeg or tiff format, or rendering into a raw bitmap array for further processing in R.
pdf_render_page(
pdf,
page = 1,
dpi = 72,
numeric = FALSE,
antialias = TRUE,
opw = "",
upw = ""
)
pdf_convert(
pdf,
format = "png",
pages = NULL,
filenames = NULL,
dpi = 72,
antialias = TRUE,
opw = "",
upw = "",
verbose = TRUE
)
poppler_config()
pdf: file path or raw vector with pdf data
page: which page to render
dpi: resolution (dots per inch) to render
numeric: convert raw output to (0-1) real values
antialias: enable antialiasing. Must be "text" or "draw" or TRUE (both) or FALSE (neither).
opw: owner password
upw: user password
format: string with output format such as "png" or "jpeg". Must be equal to one of poppler_config()$supported_image_formats.
pages: vector with one-based page numbers to render. NULL means all pages.
filenames: vector of equal length to pages with output filenames. May also be a format string which is expanded using pages and format respectively.
verbose: print some progress info to stdout
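To illustrate the format-string form of filenames, the sketch below assumes the expansion behaves like base R's sprintf() applied to the page numbers and the format, in that order. That mechanism is an assumption rather than something this page states, and the template and variable names are hypothetical.

```r
# Sketch: how a filenames template might expand (assumes sprintf-style expansion).
pages <- 1:3                      # one-based page numbers, as passed to `pages`
format <- "png"                   # output format, as passed to `format`
template <- "news_page_%02d.%s"   # hypothetical template: %d = page, %s = format

# sprintf() is vectorised over `pages`, giving one filename per page
filenames <- sprintf(template, pages, format)
print(filenames)
```

If the assumption holds, passing the same template as the filenames argument of pdf_convert() would be expected to produce files with these names.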
Other pdftools:
pdf_ocr_text(),
pdftools,
qpdf
# Rendering should be supported on all platforms now
# convert a few pages to png
file.copy(file.path(Sys.getenv("R_DOC_DIR"), "NEWS.pdf"), "news.pdf")
#> [1] TRUE
pdf_convert("news.pdf", pages = 1:3)
#> Converting page 1 to news_1.png... done!
#> Converting page 2 to news_2.png... done!
#> Converting page 3 to news_3.png... done!
#> [1] "news_1.png" "news_2.png" "news_3.png"
# render into raw bitmap
bitmap <- pdf_render_page("news.pdf")
# save to bitmap formats
png::writePNG(bitmap, "page.png")
webp::write_webp(bitmap, "page.webp")
# Higher quality
bitmap <- pdf_render_page("news.pdf", page = 1, dpi = 300)
png::writePNG(bitmap, "page.png")
# raw output (numeric = FALSE, the default) is slightly more efficient than numeric output
bitmap_raw <- pdf_render_page("news.pdf", numeric = FALSE)
webp::write_webp(bitmap_raw, "page.webp")
# Cleanup
unlink(c('news.pdf', 'news_1.png', 'news_2.png', 'news_3.png',
'page.jpeg', 'page.png', 'page.webp'))
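The numeric argument is described as converting raw output to (0-1) real values. The following is a minimal sketch of what such a conversion could look like, assuming each byte is simply divided by 255. A small synthetic array stands in for a rendered page, so the names and dimensions here are illustrative only and no PDF is needed.

```r
# Sketch: scaling raw bitmap bytes to the (0-1) range (assumed: value / 255).
# A tiny synthetic 4-channel, 2 x 2 array stands in for rendered output.
bitmap_raw <- array(as.raw(c(0, 128, 255, 255)), dim = c(4, 2, 2))

# as.numeric() on raw bytes yields 0..255 and drops the dim attribute,
# so the dimensions are restored explicitly after scaling.
bitmap_num <- array(as.numeric(bitmap_raw) / 255, dim = dim(bitmap_raw))

print(range(bitmap_num))
```

Raw storage uses one byte per value while doubles use eight, which is one reason the raw form can be cheaper to pass on to image writers.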