Poor quality of recognition

Hello.


I’m working with Aspose OCR for Java, version 3.4.0.

I’m doing some tests, but the recognition quality is poor.
I’m using the Aspose logo and some other JPG/PNG files, but none of them is recognized correctly.

I’m using the following code:

public static void main(String[] args) {
    // Apply the license before using the OCR engine.
    License license = new License();
    license.setLicense("Aspose.Total.Java.lic");

    // Forward slashes avoid invalid escapes such as \a in a Java string literal.
    String imagePath = "path/to/aspose-logo.jpg";
    OcrEngine ocr = new OcrEngine();
    ocr.setImage(ImageStream.fromFile(imagePath));

    // process() returns true when recognition succeeds.
    if (ocr.process()) {
        System.out.println(ocr.getText());
    } else {
        System.out.println("Error reading image");
    }
}

The Aspose logo produces this result:

- o

Using the PT-BR resource pack makes the results even worse. The code used for the PT-BR language is the following:

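public static void main(String[] args) {
    // Apply the license before using the OCR engine.
    License license = new License();
    license.setLicense("Aspose.Total.Java.lic");

    // Forward slashes avoid invalid escapes in a Java string literal.
    String imagePath = "path/to/file.jpg";
    OcrEngine ocr = new OcrEngine();
    ocr.setImage(ImageStream.fromFile(imagePath));

    // Replace the default language with the PT-BR resource pack.
    ocr.getLanguageContainer().clear();
    ocr.getLanguageContainer().addLanguage(LanguageFactory.load("PT-BR.zip"));

    // process() returns true when recognition succeeds.
    if (ocr.process()) {
        System.out.println(ocr.getText());
    } else {
        System.out.println("Error reading image");
    }
}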

The language pack used is this one:
http://www.aspose.com/downloads/ocr-family/net/resources/portuguese-language-resource-file-for-aspose.ocr-for-.net-3.2.0/

The files used for PT-BR recognition are portuguese-arial.jpg and portuguese-times.jpg.

The other files used are attached to this post.

Any help would be appreciated.

Hi Paulo,

Thank you for your inquiry and sharing sample images.

We have evaluated the attached images on our end. Testing showed that the images have a very low resolution of 96 DPI. Please note that the current implementation of the Aspose.OCR API works well with images that have a resolution of at least 300 DPI, and accuracy tends to decrease as the resolution decreases. Because your images have a resolution of 96 DPI, it will not be possible to get 100% accuracy if you scan the complete image. On the other hand, if you only need specific content from a portion of the image, you can use custom recognition blocks to get better accuracy.
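
To put the resolution gap in numbers, here is a minimal sketch using only the standard JDK (no Aspose calls). It computes the scale factor needed to take a 96 DPI image to 300 DPI and resamples the image with bicubic interpolation. The class name DpiUpscale and its methods are hypothetical, and resampling is an assumption on our part rather than something recommended in this thread: it adds pixels but no real detail, so it may or may not improve recognition.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of Aspose.OCR.
class DpiUpscale {
    // Factor needed to take an image from sourceDpi to targetDpi.
    static double scaleFactor(double sourceDpi, double targetDpi) {
        return targetDpi / sourceDpi;
    }

    // Resample with bicubic interpolation; this adds pixels, not real detail.
    static BufferedImage upscale(BufferedImage src, double factor) {
        int w = (int) Math.round(src.getWidth() * factor);
        int h = (int) Math.round(src.getHeight() * factor);
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return dst;
    }

    public static void main(String[] args) {
        // Stand-in for a 2 x 1 inch scan at 96 DPI.
        BufferedImage small = new BufferedImage(192, 96, BufferedImage.TYPE_INT_RGB);
        double factor = scaleFactor(96, 300);
        BufferedImage big = upscale(small, factor);
        // Prints: 3.125 -> 600x300
        System.out.println(factor + " -> " + big.getWidth() + "x" + big.getHeight());
    }
}
```

A 96 DPI image therefore has to grow by a factor of 3.125 in each dimension to reach the 300 DPI mark. Rescanning or re-exporting the source at 300 DPI is preferable where that is possible.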

Please note that the above solution is useful when your documents follow a similar structure, that is, when the content to be scanned is always at the same location in each image.
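
The fixed-location assumption can be illustrated with a short sketch. This is not the Aspose.OCR recognition block API. It is plain JDK cropping, shown only to make the idea concrete: one rectangle is defined once and reused for every image with the same layout. The class FixedRegionCrop and the coordinates in TITLE_REGION are hypothetical.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of Aspose.OCR.
class FixedRegionCrop {
    // Made-up coordinates; valid only if every document shares this layout.
    static final Rectangle TITLE_REGION = new Rectangle(50, 20, 200, 40);

    static BufferedImage crop(BufferedImage page, Rectangle region) {
        // Clamp to the image bounds so slightly smaller scans do not throw.
        Rectangle bounds = new Rectangle(0, 0, page.getWidth(), page.getHeight());
        Rectangle r = region.intersection(bounds);
        if (r.isEmpty()) {
            throw new IllegalArgumentException("region lies outside the image");
        }
        return page.getSubimage(r.x, r.y, r.width, r.height);
    }

    public static void main(String[] args) {
        // Stand-in for a scanned page.
        BufferedImage page = new BufferedImage(300, 100, BufferedImage.TYPE_INT_RGB);
        BufferedImage title = crop(page, TITLE_REGION);
        // Prints: 200x40
        System.out.println(title.getWidth() + "x" + title.getHeight());
    }
}
```

If the text of interest moves between images, a fixed rectangle like this will cut through it or miss it, which is why the approach only suits documents with a consistent structure.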

Furthermore, the low-DPI issue has already been logged in our system with ID OCR-34250. We are continuously improving recognition quality. Low-DPI images will work once issue OCR-34250 is fixed.

Regarding your inquiry about images with colored backgrounds: at the moment, Aspose.OCR has some issues with colorful backgrounds. These issues have already been logged in our issue tracking system with ID OCR-33146, and the issue has been raised to priority support.

Our developers are working on these issues. We will update you via this forum thread. We are sorry for the inconvenience.

We hope the above information helps. Feel free to contact us if you have further queries or comments.

The issues you have found earlier (filed as ) have been fixed in this Aspose.Words for JasperReports 18.3 update.