Store Tesseract Output in PDF using R
15:37 29 Aug 2021

I am trying to use the R interface to Tesseract (the tesseract package) to create a PDF file with embedded text. I have seen the previous question "tesseract (v3.03) output as PDF", but it covers the command-line interface to tesseract; this question is about the R interface. I set the tessedit_create_pdf option to 1, but no new PDF file was created, and I do not see an option for setting the output file. How can I make tesseract create a PDF with embedded text? The code below generates good text in memory, but no PDF file.

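library(tesseract)
packageVersion("tesseract")
#> [1] ‘4.1.1’

# Engine with page segmentation mode 1 (automatic, with OSD) and the PDF option requested
eng1P <- tesseract(language = "eng",
    options = list(tessedit_pageseg_mode = 1,
        tessedit_create_pdf = 1))

# OCR runs and returns the recognized text, but no PDF file appears
text0 <- tesseract::ocr("TestImage.png", engine = eng1P)
cat(text0[[1]])
#> This image can be used for testing.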

Test Image
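
For comparison, the command-line route from the linked question can be driven from R with system2(). The following is only a minimal sketch of that fallback, and it does not settle whether the R interface itself can write the PDF. It assumes that the tesseract executable is installed and on the PATH, and that the engine is version 4.0 or later (the --psm spelling; 3.x used -psm). The helper name ocr_to_pdf is hypothetical and not part of the tesseract R package.

```r
# Sketch: shell out to the tesseract command-line program to write a searchable PDF.
# `ocr_to_pdf` is a hypothetical helper, not a function of the tesseract R package.
ocr_to_pdf <- function(image,
                       output_base = tools::file_path_sans_ext(image),
                       language = "eng",
                       psm = 1) {
  # Locate the command-line binary (assumption: it is installed and on the PATH)
  exe <- unname(Sys.which("tesseract"))
  if (!nzchar(exe)) stop("tesseract executable not found on PATH")

  # CLI form: tesseract <image> <output base> [options] pdf
  # The trailing "pdf" selects the PDF output config; tesseract appends ".pdf" itself.
  args <- c(shQuote(image), shQuote(output_base),
            "-l", language,
            "--psm", psm,   # assumption: engine >= 4.0
            "pdf")

  status <- system2(exe, args)
  if (status != 0) stop("tesseract exited with status ", status)

  # Path of the PDF that tesseract should have written
  paste0(output_base, ".pdf")
}

# Usage, guarded so that it only runs when the binary and the test image exist
if (nzchar(Sys.which("tesseract")) && file.exists("TestImage.png")) {
  pdf_path <- ocr_to_pdf("TestImage.png")
  print(file.exists(pdf_path))
}
```

With the defaults, the output base is derived from the image name, so TestImage.png would be written to TestImage.pdf in the same directory.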

r pdf ocr tesseract