Our company receives resumes in multiple formats: RTF, DOC, and PDF. We get PDFs from one resume search site. I need to OCR the image-based PDFs and convert them to normal Word documents that contain text. I already tried a conversion and got a Word document containing images of the PDF pages. Which products do I need to complete this conversion?
You can use Aspose.OCR to extract text from images. The API also supports exporting the recognition results to .docx, .pdf, and .txt files. You can extract the images from Word and PDF files using Aspose.Words and Aspose.PDF, and then run OCR on them using Aspose.OCR. For more information, please check the documentation for these products.
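As a rough illustration of the extract-then-OCR workflow described above, here is a minimal Python sketch. It does not use the Aspose APIs: the `recognize` callable is a hypothetical stand-in for whatever OCR engine you call, and `ocr_images_to_docx` is an illustrative name, not part of any product. The sketch assumes the page images have already been extracted from the PDF as bytes, and it writes the recognized text into a bare-bones .docx package using only the standard library.

```python
import zipfile
from typing import Callable, Iterable
from xml.sax.saxutils import escape

# Fixed parts of a minimal .docx (Office Open XML) package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraphs_to_document_xml(paragraphs: Iterable[str]) -> str:
    """Wrap each text line in a plain WordprocessingML paragraph."""
    body = "".join(
        f'<w:p><w:r><w:t xml:space="preserve">{escape(p)}</w:t></w:r></w:p>'
        for p in paragraphs
    )
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )


def ocr_images_to_docx(
    page_images: Iterable[bytes],
    recognize: Callable[[bytes], str],  # hypothetical OCR function: image bytes -> text
    out_path: str,
) -> int:
    """OCR each page image and write the non-blank lines as paragraphs.

    Returns the number of paragraphs written.
    """
    paragraphs = []
    for image in page_images:
        text = recognize(image)
        paragraphs.extend(line for line in text.splitlines() if line.strip())

    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as docx:
        docx.writestr("[Content_Types].xml", CONTENT_TYPES)
        docx.writestr("_rels/.rels", RELS)
        docx.writestr("word/document.xml", paragraphs_to_document_xml(paragraphs))
    return len(paragraphs)
```

Note that this sketch only produces plain paragraphs: fonts, tables, and page layout from the original PDF are not carried over, because the OCR step here returns text only.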
Thanks for the response. Is there a way to keep the formatting? I don't want pure text output, but a Word document with the formatting preserved.
Could you please explain your requirements in more detail so that we can help you further?