Extract text from an image using the tesseract package.
image_ocr(image, language = "eng", HOCR = FALSE, ...)
image_ocr_data(image, language = "eng", ...)
image: magick image object returned by image_read() or image_graph()
language: passed to tesseract. To install additional languages see instructions in tesseract_download().
HOCR: if TRUE return results as HOCR xml instead of plain text
...: additional parameters passed to tesseract
To use this function you need to install the tesseract package first.
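A minimal sketch of checking for the package before calling the OCR functions. The variable name `has_tesseract` is only illustrative; the install command is left as a message so that nothing is installed without you running it yourself.

```r
# Check whether the tesseract package is available without attaching it.
has_tesseract <- requireNamespace("tesseract", quietly = TRUE)

if (!has_tesseract) {
  # tesseract is distributed on CRAN; run the command below once to install it.
  message('tesseract is not installed. Run: install.packages("tesseract")')
}
```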
Best results are obtained if you set the correct language in tesseract. To install additional languages see instructions in tesseract_download().
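As a rough illustration of setting the language, the helper below downloads the training data for a language if it is not yet available and then runs image_ocr() with it. The function name `ocr_in_language`, its arguments, and the default "fra" are hypothetical choices made for this sketch; it assumes tesseract_info()$available lists the installed languages.

```r
# Hypothetical helper: OCR an image file in a given language.
# `path` is any file or URL that magick::image_read() can open.
ocr_in_language <- function(path, lang = "fra") {
  stopifnot(
    requireNamespace("magick", quietly = TRUE),
    requireNamespace("tesseract", quietly = TRUE)
  )
  # Download the language data only if it is missing (assumption: the
  # `available` element of tesseract_info() holds installed languages).
  if (!lang %in% tesseract::tesseract_info()$available) {
    tesseract::tesseract_download(lang)
  }
  img <- magick::image_read(path)
  magick::image_ocr(img, language = lang)
}
```

Calling it would look like `ocr_in_language("scan.png", lang = "fra")`, where "scan.png" stands in for your own image.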
# \donttest{
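library(magick)
if (require("tesseract")) {
  # Read the sample image, then run OCR on it
  img <- image_read("http://jeroen.github.io/images/testocr.png")
  image_ocr(img)       # recognized text as a character string
  image_ocr_data(img)  # recognized words as a data frame
}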
#> Loading required package: tesseract
#> Warning: there is no package called ‘tesseract’
# }
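The data frame from image_ocr_data() can be post-processed like any other. The sketch below keeps only words above a confidence threshold. It assumes the result has `word` and `confidence` columns with confidence on a 0 to 100 scale, as in tesseract::ocr_data(); the helper name `confident_words`, the threshold of 80, and the sample rows are made up for illustration and are not real OCR output.

```r
# Hypothetical helper: drop low-confidence words from an OCR data frame.
confident_words <- function(ocr_df, min_confidence = 80) {
  stopifnot(
    is.data.frame(ocr_df),
    all(c("word", "confidence") %in% names(ocr_df))
  )
  ocr_df[ocr_df$confidence >= min_confidence, , drop = FALSE]
}

# Made-up stand-in for the output of image_ocr_data(img)
example <- data.frame(
  word = c("This", "iz", "text"),
  confidence = c(96.1, 41.7, 92.3),
  stringsAsFactors = FALSE
)

kept <- confident_words(example)
paste(kept$word, collapse = " ")
```

With real data you would pass `image_ocr_data(img)` instead of `example`.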