Hi,
There are loads of articles on the forums covering this topic, but most are old and none are particularly helpful.
We have a simple requirement: take an existing PDF document with no text layer, OCR it, and save it back to a PDF document.
We currently do this by breaking the PDF down into individual image files, then using Tesseract to OCR them and save the result as a PDF document, but this is slow and CPU intensive. We shouldn't need to do this!
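For context, the current pipeline is roughly equivalent to the sketch below. It assumes the poppler `pdftoppm` and `tesseract` command-line tools are on the PATH; the function names, the 300 DPI setting and the file names are illustrative assumptions, not our exact code.

```python
import subprocess
import tempfile
from pathlib import Path


def rasterize_cmd(pdf_path, prefix, dpi=300):
    # pdftoppm writes one PNG per page: <prefix>-1.png, <prefix>-2.png, ...
    # (the page number may be zero-padded). 300 DPI is an assumed value.
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def ocr_cmd(list_file, output_base):
    # Given a text file listing one image path per line, tesseract's "pdf"
    # output writes <output_base>.pdf: page images plus an invisible text layer.
    return ["tesseract", str(list_file), str(output_base), "pdf"]


def page_number(path):
    # "page-07.png" -> 7, so pages sort numerically rather than lexically.
    return int(Path(path).stem.rsplit("-", 1)[1])


def ocr_pdf(pdf_path, output_base):
    # Hypothetical end-to-end helper: rasterize every page, then OCR them all.
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp) / "page"
        subprocess.run(rasterize_cmd(pdf_path, prefix), check=True)
        pages = sorted(Path(tmp).glob("page-*.png"), key=page_number)
        list_file = Path(tmp) / "pages.txt"
        list_file.write_text("\n".join(str(p) for p in pages) + "\n")
        subprocess.run(ocr_cmd(list_file, output_base), check=True)
    return Path(f"{output_base}.pdf")
```

Calling `ocr_pdf("scan.pdf", "scan_ocr")` would produce `scan_ocr.pdf`. Every page is rasterized to disk and then read back by the OCR step, which is where most of the time and CPU goes.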
We have code which transforms the existing PDF into a multipage TIFF, and we OCR that TIFF with Aspose.OCR. Can you outline, with sample code, how we can convert the TIFF file to PDF and embed the OCR'd text?
Thanks