Hello,
I’m trying to recognize a number from an image with Aspose.OCR. The image can be found here: Dropbox - Error - Simplify your life
I’m running the following code. This is my implementation:
public String extractText(InputStream stream) throws DocumentException {
    // Create an instance of OcrEngine
    OcrEngine ocrEngine = new OcrEngine();
    // Set Resources for OcrEngine
    ocrEngine.setResource(BmpDocumentService.class.getResourceAsStream("/Aspose.OCR.1.9.0.Resources.zip"));
    // Set NeedRotationCorrection property to false
    ocrEngine.getConfig().setNeedRotationCorrection(false);
    // Set image file
    ocrEngine.setImage(ImageStream.fromBytes(BinaryUtil.toArray(stream), ImageStreamFormat.Bmp));
    // Add language
    ocrEngine.getLanguages().addLanguage(Language.load("english"));
    // Add a noise-removal correction filter
    CorrectionFilters value = new CorrectionFilters();
    value.add(new RemoveNoiseFilter());
    ocrEngine.getConfig().setCorrectionFilters(value);
    ocrEngine.setDetectTextOnly(false);
    // Perform OCR and get extracted text
    if (ocrEngine.process()) {
        LOG.debug("getProbabilitySymbols" + ocrEngine.getProbabilitySymbols());
        return ocrEngine.getText().toString();
    } else {
        return null;
    }
}
And this is my test that calls the implementation:
public void highQuality() {
    // The value returned by the service is the actual result, not the expected one
    String actual = service.extractText(BmpDocumentServiceTest.class.getResourceAsStream("highQuality.bmp"));
    assertThat(actual).isEqualTo("998.508985.7");
}
The test fails with the following output:
org.junit.ComparisonFailure:
Expected :'998.508985.7'
Actual :'si J- 8 t ztio- ?'
Are there any parameters I can tune to improve the recognition?
Since I only want to parse numbers, it might be possible to define a custom language by implementing the ILanguage interface, but I don’t know how to do that. An example would be appreciated.
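As a stopgap, I am considering filtering the recognized text down to digits and dots after the OCR step. Below is a minimal sketch in plain Java. NumericFilter and keepNumeric are my own hypothetical names and are not part of Aspose.OCR. It only strips stray characters and cannot repair digits that were misread as letters, so it would not fix the result above on its own.

```java
public final class NumericFilter {

    private NumericFilter() {
        // Utility class, not meant to be instantiated
    }

    // Hypothetical helper: keeps only ASCII digits and '.' from the OCR output.
    // Every other character (letters, whitespace, punctuation) is dropped.
    public static String keepNumeric(String recognized) {
        if (recognized == null) {
            return null;
        }
        StringBuilder result = new StringBuilder(recognized.length());
        for (int i = 0; i < recognized.length(); i++) {
            char c = recognized.charAt(i);
            if ((c >= '0' && c <= '9') || c == '.') {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // A correctly recognized number with surrounding whitespace
        System.out.println(keepNumeric(" 998.508985.7\n")); // prints 998.508985.7
    }
}
```

In extractText, this would wrap the return value, for example keepNumeric(ocrEngine.getText().toString()).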
Greetings,
Rainer