OCR module

We are interested in creating an OCR template for some of our forms. However, many of our forms are in PDF format. Would we need to convert these forms into a different format before the OCR process?

Hi Prudential,


Thank you for considering Aspose products, and welcome to the Aspose.OCR support forum.

Aspose.OCR APIs require documents to be in an image format (JPEG, BMP, TIFF, PNG) in order to perform OCR on them. Please let us know which platform (.NET or Java) you are interested in so that we can provide the API download link and code snippets for your evaluation.

.NET

Thanks!

Hi Prudential,

Thank you for the confirmation.

Please download the latest version, Aspose.OCR for .NET 1.9.0, and its corresponding resource file. Code snippets for performing an OCR operation on an image are available in this technical article. Please note that in evaluation mode, that is, without a valid license set, Aspose.OCR APIs can extract only up to 50 characters from a given image. To avoid the evaluation limitations and test the product at its full capacity, we suggest that you get a 30-day temporary license.

If you wish to convert your PDF documents to raster image formats programmatically, you may use the Aspose.Pdf for .NET API for this purpose. Please check the following articles if they interest you.

Please note that whatever method or tool you choose to export the PDF files, the resulting images should have a high resolution (preferably 300x300 DPI) for maximum accuracy of the OCR results. If you face any difficulty, please feel free to write back.
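
As a rough illustration of the resolution advice above, here is a minimal sketch of the arithmetic, assuming that "300x300" refers to dots per inch and that page sizes are given in PDF points (1 point = 1/72 inch). This is plain Python rather than Aspose API code, and the function name pixels_at_dpi is hypothetical; it only estimates how large an exported page image should be in pixels.

```python
POINTS_PER_INCH = 72  # assumption: page size is expressed in PDF points


def pixels_at_dpi(width_pt, height_pt, dpi=300):
    """Return (width_px, height_px) for a page rendered at the given DPI.

    Hypothetical helper for illustration only; not part of any Aspose API.
    """
    width_px = round(width_pt * dpi / POINTS_PER_INCH)
    height_px = round(height_pt * dpi / POINTS_PER_INCH)
    return width_px, height_px


# US Letter is 612 x 792 points (8.5 x 11 inches).
print(pixels_at_dpi(612, 792))  # (2550, 3300)
```

If an exported image of a Letter-size form is much smaller than about 2550 x 3300 pixels, it was probably rendered below 300 DPI and may be worth re-exporting before running OCR on it.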