OCR general information

Hi,

I am looking for an API that can do OCR and found that Aspose provides one. I am interested in the Java version. Could you please explain what the Java version of OCR supports? What is the input, and what output is produced? How does it distinguish similar characters such as “0 vs o”, “l vs 1”, “l vs I”, etc.?

I noticed on the products page that the .NET and Java versions differ in certain ways. So again, I am interested in the Java version.

Thanks,
Olga

Hi Olga,


Sorry for the delayed response. I’m working on your inquiry and will update you soon.

Best Regards,

Hi Olga,

Thanks for considering Aspose.OCR for Java. Currently, it supports the BMP file format as input, with English as the recognition language. It can recognize the Arial, Times New Roman and Tahoma fonts in regular, bold and italic styles. Recognition accuracy for large font sizes (32 pt and above) is 90%; smaller font sizes are recognized less accurately. Our development team is working on a major revamp of the Aspose.OCR API to improve performance and to add support for smaller font sizes, new fonts and more languages.
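
Since only BMP input is supported at the moment, images in other formats would need to be converted before they are passed to the OCR engine. Below is a minimal sketch of such a conversion using only the standard javax.imageio API. It is not part of Aspose.OCR, and the names BmpConverter, toRgb and convertToBmp are illustrative assumptions, not library calls.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper (not an Aspose.OCR class): converts an image to BMP.
public class BmpConverter {

    // Copies the image onto an opaque white RGB canvas.
    // The standard BMP writer does not accept images with an alpha channel.
    public static BufferedImage toRgb(BufferedImage src) {
        BufferedImage rgb = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, rgb.getWidth(), rgb.getHeight());
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return rgb;
    }

    // Reads any format ImageIO understands (PNG, JPEG, GIF, ...) and writes a BMP file.
    public static void convertToBmp(File input, File output) throws IOException {
        BufferedImage src = ImageIO.read(input);
        if (src == null) {
            throw new IOException("Unsupported or unreadable image: " + input);
        }
        if (!ImageIO.write(toRgb(src), "bmp", output)) {
            throw new IOException("No BMP writer available for: " + output);
        }
    }
}
```

A call such as BmpConverter.convertToBmp(new File("scan.png"), new File("scan.bmp")) would produce a BMP file; the file names here are placeholders. Transparent areas are flattened to white so that dark text keeps its contrast against the background.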

Regarding your character recognition question, the API uses a resource file for this purpose. The resource file is a ZIP archive that contains the data necessary to perform OCR (an XML etalon file and a font collection file). Please check the following documentation link for details.

Please feel free to contact us for any further assistance.

Best Regards,