Croatian support settings

tbabi · November 16, 2022, 11:08am

Hello,

We are considering getting the Aspose.OCR for Java licence, and have been testing it in trial (no licence) mode for the last few days to get started.

Our firm is based in Croatia and we are interested in Croatian language OCR support. From testing I have found that setting the language to Srp_hrv leads to special characters (ČćŠĐ…) not being recognized. When setting it to None, we do get those characters, but we also get German/Spanish characters recognized (there are none in the test jpg).
The test code is below; all other settings are left at their defaults (Autoskew and Detect areas are both true by default).

public String performOCROnImage(BufferedImage image, RecognitionSettings settings) {
    AsposeOCR api = new AsposeOCR();
    // settings.setLanguage(Language.None)
    settings.setLanguage(Language.Srp_hrv);
    RecognitionResult result;
    try {
        result = api.RecognizePage(image, settings);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    return result == null ? null : result.recognitionText;
}

I am attaching the sample jpg and our results for both the Srp_hrv and None settings. I would be interested in the best settings for Croatian, as a good result would go a long way in convincing us to get the licence.
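
For comparing the two result files more objectively, here is a minimal sketch in plain Java (no Aspose dependency) that counts Croatian diacritics in a recognized string and lists any other non-ASCII letters that slipped in. The class name CroatianCharCheck and both method names are hypothetical helpers made up for illustration, not part of Aspose.OCR.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for comparing OCR outputs; not part of any library.
final class CroatianCharCheck {
    // Croatian letters outside basic ASCII, written as escapes so the
    // source compiles regardless of file encoding: Č č Ć ć Đ đ Š š Ž ž
    private static final String CROATIAN_DIACRITICS =
            "\u010C\u010D\u0106\u0107\u0110\u0111\u0160\u0161\u017D\u017E";

    private CroatianCharCheck() {
    }

    /** Counts how many characters in the text are Croatian diacritic letters. */
    static int countCroatianDiacritics(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (CROATIAN_DIACRITICS.indexOf(text.charAt(i)) >= 0) {
                count++;
            }
        }
        return count;
    }

    /** Returns the distinct non-ASCII letters that are not Croatian diacritics. */
    static Set<Character> foreignLetters(String text) {
        Set<Character> found = new LinkedHashSet<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 127 && Character.isLetter(c) && CROATIAN_DIACRITICS.indexOf(c) < 0) {
                found.add(c);
            }
        }
        return found;
    }
}
```

Running both methods on the string returned by performOCROnImage for each language setting would show whether a setting drops the diacritics (low count) or introduces letters such as ü or ñ (non-empty set).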

ocr_test_document.jpg (277.1 KB)
CroAndNoneResult.zip (863 Bytes)

asad.ali · November 16, 2022, 8:45pm

@tbabi

We need to investigate this case, and for that purpose an investigation ticket, OCRJAVA-296, has been logged in our issue tracking system. We will look further into the details and keep you posted on the status of its rectification. Please be patient and give us some time.

We are sorry for the inconvenience.

tbabi · November 17, 2022, 12:47pm

Thank you very much, I look forward to any updates.

asad.ali · December 6, 2022, 9:35pm

@tbabi

Support has been added in version 22.11.1 of the API.