Hello,
We are considering purchasing an Aspose.OCR for Java licence and have been testing it in trial (no licence) mode for the last few days to get started.
Our firm is based in Croatia, and we are interested in Croatian-language OCR support. In testing, I found that setting the language to Srp_hrv leads to special characters (ČćŠĐ…) not being recognized. Setting it to None does recover those characters, but German/Spanish characters are then recognized as well, even though there are none in the test jpg.
The test code is below; all other settings are left at their defaults (Autoskew and Detect areas are true by default).
public String performOCROnImage(BufferedImage image, RecognitionSettings settings) {
    AsposeOCR api = new AsposeOCR();
    // settings.setLanguage(Language.None);
    settings.setLanguage(Language.Srp_hrv);
    RecognitionResult result;
    try {
        result = api.RecognizePage(image, settings);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    return result == null ? null : result.recognitionText;
}
I am attaching the sample jpg and our results for both the Srp_hrv and None settings. We would be interested in the best settings for Croatian, as a good result would go a long way toward convincing us to get the licence.
ocr_test_document.jpg (277.1 KB)
CroAndNoneResult.zip (863 Bytes)