How to OCR PDF files?

Hi,
We have the latest Aspose.Total API license. We have used all the other DLLs (PDF, cell, image, barcode), but not the Aspose OCR DLL yet. Before we start implementing, we would like to clarify the following points about the OCR API.

  1. Does it support multiple languages, including Traditional Chinese, Simplified Chinese and French, as well as PDF files that mix English and Chinese words?
  2. Punch hole removal: when a file is OCR'd, can it detect punch hole marks and remove them?
  3. Is it possible to OCR only the pages that have not been OCR'd yet?
    E.g. if a PDF has 10 pages, pages 3, 5, 8 and 10 are already OCR'd and the other pages still need OCR, does the Aspose OCR API skip the already OCR'd pages (3, 5, 8, 10)?

Regards,
Aravind

@bpanchu,

Thank you for your inquiry. Following are the details:

  1. Aspose.OCR for .NET API currently supports the following languages:
    English
    Spanish
    French
    Portuguese

  2. Punch hole removal: there is no such functionality available in the API.

  3. The API has no built-in functionality to skip pages; this depends on your own logic and implementation. To perform OCR on a PDF file, you have to process it page by page, and while doing so you can implement logic to skip the pages that are already OCR'd.
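
The page-skipping logic from point 3 might look roughly like the sketch below. It is written in plain Python only to illustrate the idea and does not use any Aspose API. It assumes you have already extracted the existing text layer of each page with whatever PDF library you use, and that an empty text layer means the page has not been OCR'd, which is a heuristic rather than a guarantee. The names `pages_needing_ocr`, `ocr_missing_pages` and the `ocr_page` callback are hypothetical and stand in for your own code.

```python
from typing import Callable, Dict, List


def pages_needing_ocr(page_texts: List[str], min_chars: int = 1) -> List[int]:
    """Return 1-based numbers of pages whose existing text layer is (nearly) empty.

    Assumption: a page with fewer than `min_chars` non-whitespace-trimmed
    characters of extractable text has not been OCR'd yet.
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len(text.strip()) < min_chars
    ]


def ocr_missing_pages(
    page_texts: List[str], ocr_page: Callable[[int], str]
) -> Dict[int, str]:
    """Call the (hypothetical) `ocr_page` callback only for pages without text."""
    return {number: ocr_page(number) for number in pages_needing_ocr(page_texts)}


# Example from the question: a 10-page PDF where pages 3, 5, 8 and 10
# already carry a text layer and the rest are image-only.
existing = ["", "", "text", "", "text", "", "", "text", "", "text"]
print(pages_needing_ocr(existing))  # [1, 2, 4, 6, 7, 9]
```

In a real application the `ocr_page` callback would render the given page to an image and pass it to the OCR engine; that part is omitted here because it depends on the libraries you use.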

Thanks for the reply.

Regards,
Aravind

@bpanchu,

You are welcome. Please feel free to contact us if you have any further queries or comments.