Perform OCR Operation over an Image using Aspose.OCR for .NET

Dear Colleagues

We are trying to get OCR running in our application. We would like to extract German text from screenshots. We have seen your help page listing the supported characters, https://docs.aspose.com/ocr/net/recognition-languages/, and found German characters such as ö ä ü, … there as well.

For the attached “sample.jpg” the OCR text is:

ÖäfüAE-Cb (according to sample.jpg it should be: öäüAEO)

This is far from the content of sample.jpg, even though the image has good resolution and uses Arial with no background.

                //OCR
                string dataDir = @"c:\temp\";
                AsposeOcr api = new AsposeOcr();
                string result = api.RecognizeImage(dataDir + "sample.jpg");
                
                Console.WriteLine(result);
                File.WriteAllText(dataDir + "sample.txt", result);

We are using Aspose.OCR 20.4.2.0.

What’s wrong here?

Thanks a lot for your support.
Incite GmbH
Marc Huber

sample.jpg (5.9 KB)

@marchuber

We have been able to reproduce the issue in our environment and have therefore logged it as OCRNET-177 in our issue tracking system. We will look further into its details and keep you informed about the status of the fix. Please be patient and give us some time.

We are sorry for the inconvenience.

Dear Asad

Is there any news on that topic?

Thanks.
Marc Huber

@marchuber

We are continuously working on improving the recognition quality of the API. The latest version, Aspose.OCR for .NET 20.7, has better recognition performance. Would you kindly try it and let us know your feedback? Furthermore, we will inform you as soon as the ticket is closed.

I tried with version 20.8, but the recognition is still very poor. I had a very clear picture with:

öäüAEO

and this is recognized as:

ÖäÜAE(

************* Trial Licenses *************

Can you please send that information to your developers? Any scanner software can recognize this 10x better.

See my attached sample image. sample.jpg (5.9 KB)

Kind regards.
Marc Huber

@marchuber

We sincerely apologize for the inconvenience.

We have updated the ticket information as per your provided feedback and will inform you as soon as it is resolved. We greatly appreciate your patience in this matter.

Dear Ali

Do you have any news on that? Can you please test sample.jpg to see whether the quality is better now?

Thanks a lot.

Kind regards.
Marc Huber

Marc,

I tried something similar on my side, and for a JPG containing öäüAEO I got ,y9%.
The font was Calibri and the test used a registered version of Aspose.OCR 21.1. The OCR component still has issues.

BR,
Oliver

@marchuber, @dr.doc

One of the problems is that the DSR model of the API works better with multi-line pages. So we have an overloaded method in which use of the DSR model can be switched off (detect areas = false). This API method gives a better result for this particular image.

Code example

AsposeOcr api = new AsposeOcr();
var img = @".\sample.jpg";
var res = api.RecognizeImage(img, false);

RESULT:

ÖäÜAEO

Still, we have an anomaly: lower-case letters are recognized as upper case. We hope to solve this in upcoming releases. You will be notified as soon as the issue is resolved. Please give us some time.

We apologize for the inconvenience.
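To make the case anomaly described above easier to report and track across versions, a small helper along the following lines could be used. It is only a sketch and not part of Aspose.OCR: the class name OcrCaseCheck and the method CaseOnlyMismatches are hypothetical, and it assumes both strings use precomposed characters (a single char per umlaut) and compares only up to the length of the shorter string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the Aspose.OCR API.
public static class OcrCaseCheck
{
    // Returns the zero-based positions at which the expected text and the
    // recognized text differ only in letter case (for example 'ö' vs 'Ö').
    // Assumption: both strings use precomposed characters, so each umlaut
    // occupies exactly one char. Only the common prefix length is compared.
    public static List<int> CaseOnlyMismatches(string expected, string actual)
    {
        var positions = new List<int>();
        int length = Math.Min(expected.Length, actual.Length);

        for (int i = 0; i < length; i++)
        {
            char e = expected[i];
            char a = actual[i];

            // Different characters that become equal once lower-cased
            // are treated as a pure case mismatch.
            if (e != a && char.ToLowerInvariant(e) == char.ToLowerInvariant(a))
            {
                positions.Add(i);
            }
        }

        return positions;
    }
}
```

For example, OcrCaseCheck.CaseOnlyMismatches("öäüAEO", "ÖäÜAEO") should report positions 0 and 2, matching the two letters that were recognized as upper case in the result above.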

Thanks a lot for your feedback.

We would be glad to hear from you as soon as the quality improves.

@marchuber

Sure, we will let you know once the ticket is completely resolved.

@marchuber

With Aspose.OCR 21.1.2 we got the following result:

öäUAEO

We used the following code:

AsposeOcr api = new AsposeOcr();
var res = api.RecognizeImage(imgPath, new RecognitionSettings { RecognizeSingleLine = true });
Console.WriteLine(res.RecognitionText);
res.Save("D://res.txt", SaveFormat.Text);

In the current release, we improved our model and got better recognition quality. Also, note that for a single line it is better to use the flag:

RecognizeSingleLine = true

With the current version, 21.3, we got öäüAEO.
Code:

AsposeOcr api = new AsposeOcr();
var res = api.RecognizeImage(imgPath, new RecognitionSettings { RecognizeSingleLine = true });
Console.WriteLine(res.RecognitionText);
res.Save("D://res.txt"); 

Dear Ali

Thanks. But it should be öäüAEO instead of öäUAEO.

Kind regards.
Marc Huber

@marchuber

Using version 21.3 of the API, we got the same result as shared in our previous response.