OCR fails for a number

matschmann · August 12, 2014, 4:10am

Hello,

I’m trying to recognize a number from an image with Aspose.OCR. The image can be found here: Dropbox - Error - Simplify your life

I’m running the following code. This is my implementation:

public String extractText(InputStream stream) throws DocumentException {
    // Create an instance of OcrEngine
    OcrEngine ocrEngine = new OcrEngine();

    // Set resources for OcrEngine
    ocrEngine.setResource(BmpDocumentService.class.getResourceAsStream("/Aspose.OCR.1.9.0.Resources.zip"));

    // Set NeedRotationCorrection property to false
    ocrEngine.getConfig().setNeedRotationCorrection(false);

    // Set image file
    ocrEngine.setImage(ImageStream.fromBytes(BinaryUtil.toArray(stream), ImageStreamFormat.Bmp));

    // Add language
    ocrEngine.getLanguages().addLanguage(Language.load("english"));

    // Add a noise-removal correction filter
    CorrectionFilters value = new CorrectionFilters();
    value.add(new RemoveNoiseFilter());
    ocrEngine.getConfig().setCorrectionFilters(value);

    ocrEngine.setDetectTextOnly(false);

    // Perform OCR and get extracted text
    if (ocrEngine.process()) {
        LOG.debug("getProbabilitySymbols" + ocrEngine.getProbabilitySymbols());
        return ocrEngine.getText().toString();
    } else {
        return null;
    }
}

And this is my test that calls the implementation:

@Test
public void highQuality() throws Exception {
    // The value returned by the service is the actual result, not the expected one
    String actual = service.extractText(BmpDocumentServiceTest.class.getResourceAsStream("highQuality.bmp"));
    assertThat(actual).isEqualTo("998.508985.7");
}

The test fails with the following output:

org.junit.ComparisonFailure:
Expected :'998.508985.7'
Actual :'si J- 8 t ztio- ?'

Are there any parameters I can tune to improve OCR recognition?

Because I only want to parse numbers, it might be possible to define a custom language by implementing the ILanguage interface, but I don’t know how to do that. An example would be nice.

Greetings,
Rainer
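
For the digits-only case described above, one option that does not depend on any Aspose API is to post-process the recognized string. The following is only a minimal sketch under assumptions: the class name NumericOcrCleanup, the character substitutions, and the expected pattern (digit groups separated by dots) are all hypothetical and not part of Aspose.OCR. It can repair a result that is nearly right, but it cannot rescue output as far off as the one shown in the failing test.

```java
import java.util.Optional;
import java.util.regex.Pattern;

class NumericOcrCleanup {
    // Assumed shape of a valid result: digit groups separated by dots, e.g. 998.508985.7
    private static final Pattern NUMBER = Pattern.compile("\\d+(\\.\\d+)*");

    // Replaces characters that are commonly confused with digits and drops whitespace.
    // The substitution table is an illustrative assumption, not an Aspose feature.
    static String normalize(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            switch (c) {
                case 'O': case 'o': sb.append('0'); break;
                case 'l': case 'I': sb.append('1'); break;
                case 'S': sb.append('5'); break;
                case 'B': sb.append('8'); break;
                case ',': sb.append('.'); break;
                default:
                    if (!Character.isWhitespace(c)) {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Returns the cleaned number, or empty if the text still does not look numeric.
    static Optional<String> parseNumber(String raw) {
        String candidate = normalize(raw);
        return NUMBER.matcher(candidate).matches()
                ? Optional.of(candidate)
                : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(parseNumber("998.5O8985.7"));      // one letter O misread for a zero
        System.out.println(parseNumber("si J- 8 t ztio- ?")); // the output from the failing test
    }
}
```

Returning an empty Optional rather than a best guess lets the caller tell a usable number apart from a failed recognition, so a bad value is not passed on silently.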

babar.raza · August 12, 2014, 11:05am

Hi Rainer,

Thank you for contacting Aspose support.

We have evaluated your scenario with the latest version, Aspose.OCR for Java 2.0.0, and your provided sample. Unfortunately, we were unable to get correct results, so we have logged the problem in our bug tracking system under ticket OCR-33821 for further investigation and correction. Please allow us some time to properly analyze the cause. In the meantime, we will keep you posted on any updates.

aspose.notifier · May 2, 2015, 12:04pm

The issues you have found earlier (filed as OCR-33821) have been fixed in this update.

This message was posted using Notification2Forum from Downloads module by Aspose Notifier.

awais.hafeez · March 29, 2018, 5:23am

The issues you have found earlier (filed as ) have been fixed in this Aspose.Words for JasperReports 18.3 update.