OCR result not complete

HauptlorenzAG · October 18, 2022, 6:17am

First question: is the .RecognitionText output incomplete under the temporary license?
It gives me part of the result and then stops with:
"4. Zuständige Bauaufsichtsbehörde und z
************* Trial Licenses *************"
Will the recognized text have the same length after purchase, or is content being skipped here?

Second question: my picture is a form. OCR returns only the headings, not the form content (the text inside bordered rectangles). I cannot find any option that also returns this content.
Can I attach the image privately so that you can have a look?

asad.ali · October 18, 2022, 3:34pm

@HauptlorenzAG

No, a temporary license works just like a permanent license. However, please make sure that you set it correctly before calling any API method. See: Licensing

Moreover, would you please share your sample image for our reference? We will test the scenario in our environment and address it accordingly.

HauptlorenzAG · October 18, 2022, 3:51pm

OK, in that case the main headings are also missing from the extracted text.
Can I upload the image here so that it is not publicly visible?

asad.ali · October 18, 2022, 8:51pm

@HauptlorenzAG

Files uploaded here are accessible only to Aspose staff and the post creator (you), so you can upload your image here. You can also share the file via private message. image.png (11.8 KB)

HauptlorenzAG · October 19, 2022, 6:42am

Here is the file:

2022_10_18_08_08_18_Muster_für_eine_Bescheinigung.png (92.1 KB)

asad.ali · October 19, 2022, 5:32pm

@HauptlorenzAG

We are checking it and will get back to you shortly.

asad.ali · October 19, 2022, 9:10pm

@HauptlorenzAG

Would you please try this code snippet and let us know whether it resolves the issue:

// dataDir is assumed to hold the path of the folder containing the image.
Aspose.OCR.AsposeOcr api = new Aspose.OCR.AsposeOcr();
var result = api.RecognizeImage(dataDir + "2022_10_18_08_08_18_Muster_für_eine_Bescheinigung.png", new Aspose.OCR.RecognitionSettings
{
    DetectAreas = false,
    RecognizeSingleLine = false,
    AutoSkew = true,
    AllowedCharacters = Aspose.OCR.CharactersAllowedType.ALL,
    //AutoContrast = true,
    Language = Aspose.OCR.Language.Eng
});
// Save the recognized text to a plain-text file.
result.Save(dataDir + "recResults.txt", Aspose.OCR.SaveFormat.Text);

HauptlorenzAG · October 20, 2022, 5:03am

I’ve tried this and the result is slightly better. The important property is “DetectAreas = false”; I had it set to true in my OCR setup.

But still half of the text is missing. Do you see the difference?

Also: I can’t use different settings for different documents. We are developing a document management system that uses the same settings for all files. What do you suggest?

asad.ali · October 20, 2022, 3:14pm

@HauptlorenzAG

We will investigate this scenario further. A ticket, OCRNET-596, has been logged in our issue tracking system for this purpose, and we will let you know as soon as it is resolved. Please be patient and allow us some time.

We apologize for the inconvenience.

asad.ali · October 28, 2022, 7:29pm

@HauptlorenzAG

The best result you can currently get uses the following settings:

var res = api.RecognizeImage(@"2022_10_18_08_08_18_Muster_für_eine_Bescheinigung.png", new RecognitionSettings
{
    DetectAreasMode = DetectAreasMode.TABLE
});

There are still some mistakes with words placed near the table lines. We will improve this in future releases.

HauptlorenzAG · October 31, 2022, 6:14am

Thanks for having a look at this. I assume that I can’t set DetectAreasMode to DetectAreasMode.TABLE, because that might have a negative impact on other images.

asad.ali · October 31, 2022, 4:18pm

@HauptlorenzAG

Your concerns have been recorded and we will improve this API feature in the future. We apologize for the inconvenience caused.