Which controls should I be using?

mclinton · July 17, 2021, 1:05am

Our company receives resumes in multiple formats: RTF, DOC, and PDF. We get PDFs from one resume search site. I need to convert and OCR the image-only PDFs into a normal Word document that contains text. I already tried converting and got a Word document containing images of the PDF pages. Which products do I need to complete this conversion?

asad.ali · July 18, 2021, 5:24pm

@mclinton

You can use Aspose.OCR to extract text from images. The API also supports exporting the recognition results to .docx, .pdf, and .txt files. You can extract images from Word and PDF files using Aspose.Words and Aspose.PDF, then perform OCR on them using Aspose.OCR. For more information, please check the documentation articles for these products.
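The two-step workflow described above (extract the page images, then run OCR on each one) can be sketched roughly as follows. This is only an illustration of the flow, not Aspose code: `extract_images` and `recognize` are hypothetical callables standing in for whatever the image-extraction and OCR products actually expose, and `PageText`, `ocr_document`, and `join_pages` are names made up for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class PageText:
    """Recognized text for one extracted page image."""
    page_number: int
    text: str


def ocr_document(
    extract_images: Callable[[str], Iterable[bytes]],
    recognize: Callable[[bytes], str],
    path: str,
) -> List[PageText]:
    """Extract images from the file at `path` and OCR each one in order.

    Both callables are placeholders (assumptions): plug in the real
    extraction and recognition calls from the libraries you license.
    """
    pages = []
    for number, image in enumerate(extract_images(path), start=1):
        # Strip surrounding whitespace that OCR engines often emit.
        pages.append(PageText(number, recognize(image).strip()))
    return pages


def join_pages(pages: List[PageText]) -> str:
    """Join non-empty page texts with a blank line between pages."""
    return "\n\n".join(p.text for p in pages if p.text)
```

Keeping the extraction and recognition steps as separate callables mirrors the split between products: one library pulls the images out of the file, another turns them into text. Note that this sketch yields plain text only and does not address layout or formatting.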

mclinton · July 23, 2021, 8:02pm

Thanks for the response. So, is there a way to keep the formatting? Not pure text output, but a Word document that has the formatting preserved?
mjc

mudassir.fayyaz · July 24, 2021, 1:35pm

@mclinton

Could you please explain your requirements in more detail so that we can assist you further?