Exception in thread "main" com.aspose.ocr.OcrException: Stream with resource file is empty or damaged. Please initialize resource file manualy

Hi,

Here are the details of my Aspose setup:

Aspose Version - aspose-ocr-1.7.0-java.

Resource version - Aspose.OCR.1.9.0.Resources

My file is a .bmp file. When I run my code:

package com.crawler;

import java.io.FileInputStream;
import java.io.FileNotFoundException;

import com.aspose.ocr.ILanguage;
import com.aspose.ocr.ImageStream;
import com.aspose.ocr.Language;
import com.aspose.ocr.OcrEngine;

public class AsposeOCR
{
    public static void main(String[] args) throws FileNotFoundException
    {
        OcrEngine engine = new OcrEngine();
        String imageFile = "E:\\destName.bmp";
        String resource = "E:\\Aspose.OCR.1.9.0.Resources";
        try {
                engine.setResource(new FileInputStream(resource));
                engine.setImage(ImageStream.fromFile(imageFile));
            } catch (FileNotFoundException f) {
                System.out.println(f.getMessage());
            }
        ILanguage language = Language.load("English");
        engine.getLanguages().addLanguage(language);
        engine.process();
        System.out.println(engine.getText());
    }
}

It gives me the following error message:

E:\Aspose.OCR.1.9.0.Resources (Access is denied)

Exception in thread "main" com.aspose.ocr.OcrException: Stream with resource file is empty or damaged. Please initialize resource file manualy.

at com.aspose.ocr.OcrEngine.process(Unknown Source)

at com.crawler.AsposeOCR.main(AsposeOCR.java:29)

Please suggest

Regards

Prince Philip

Hi Prince,


Thank you for contacting Aspose support.

The most probable cause of this scenario is that you are using a resource archive that is incompatible with your version of the Aspose.OCR for Java API. Please note that each release of Aspose.OCR for Java uses a specific resource archive, and using an incompatible resource file may lead to undesired results or problems such as the one in your original post. Please download the latest release, Aspose.OCR for Java 1.9.0, and its compatible resource file. Hopefully, you will not face this problem again.
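
Separately, the posted code prints the FileNotFoundException ("Access is denied") and then still calls engine.process(), so the engine runs without a usable resource stream. A minimal sketch of a pre-flight check on the resource path is shown here. It uses only java.io.File; ResourceCheck and describeProblem are hypothetical names for illustration, not part of the Aspose.OCR API, and which of these conditions applies on your machine is an assumption to verify.

```java
import java.io.File;

// Hypothetical helper (not part of Aspose.OCR): inspects the resource path
// before it is opened and handed to OcrEngine.setResource(...).
class ResourceCheck {

    // Returns null when the path looks usable, otherwise a short description of the problem.
    static String describeProblem(File resource) {
        if (!resource.exists()) {
            return "does not exist: " + resource;
        }
        if (resource.isDirectory()) {
            return "is a directory, not a file: " + resource;
        }
        if (!resource.canRead()) {
            return "is not readable: " + resource;
        }
        if (resource.length() == 0) {
            return "is empty: " + resource;
        }
        return null;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("Usage: java ResourceCheck <path-to-resource-file>");
            return;
        }
        File resource = new File(args[0]);
        String problem = describeProblem(resource);
        if (problem != null) {
            // Stop here instead of continuing on to engine.process().
            System.err.println("Resource file " + problem);
            return;
        }
        System.out.println("Resource file looks readable: " + resource.length() + " bytes");
    }
}
```

If the check reports a problem, fixing the path or permissions before calling setResource avoids the follow-on OcrException from process().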

Hi,

I made changes as per your suggestion.

Now I get the following error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

at com.aspose.ocr.internal.cf.a(Unknown Source)

at com.aspose.ocr.internal.cf.c(Unknown Source)

at com.aspose.ocr.internal.cf.b(Unknown Source)

at com.aspose.ocr.internal.mk.a(Unknown Source)

at com.aspose.ocr.internal.ci.a(Unknown Source)

at com.aspose.ocr.OcrEngine.setResource(Unknown Source)

at com.crawler.AsposeOCR.main(AsposeOCR.java:22)

Please suggest

Hi Prince,


Aspose.OCR for Java requires the resource file to be loaded into memory to perform OCR on an image. Because the resource file is large (80+ MB), loading it requires a larger heap. Please increase the Java heap size to at least 512M and give it another try. If the problem persists, please increase the Java heap size to 1G, which should be more than sufficient.
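
The heap limit is set with the standard JVM option -Xmx, for example by adding -Xmx512m (or -Xmx1g) to the java command or IDE run configuration that launches com.crawler.AsposeOCR. As a small sketch for confirming that the setting took effect, the class below prints the maximum heap the running JVM will use; HeapCheck is a hypothetical name for illustration.

```java
// Hypothetical helper: reports the maximum heap available to the current JVM,
// which reflects the -Xmx value the process was started with.
class HeapCheck {

    // Maximum heap in megabytes, as reported by the JVM.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it with the same options as the OCR program shows whether the larger heap is actually being applied; the reported figure can be slightly lower than the -Xmx value.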

Hi Babar,

Thanks for the support. The code is running without any exception, but now I am having a problem with the output text. Please go through the attached image file from which I am trying to extract the text. The text output is "cull =", which is incorrect.

I have included my program code in my previous post.

Please suggest.

regards

Prince Philip

Hi,

Now I get this exception:

Exception in thread "main" com.aspose.ocr.OcrException: Error occurred during recognition.

at com.aspose.ocr.OcrEngine.a(Unknown Source)

at com.aspose.ocr.OcrEngine.process(Unknown Source)

at com.crawler.AsposeOCR.main(AsposeOCR.java:34)

Caused by: java.lang.IllegalArgumentException: width is less than or equal to 0

at com.aspose.ocr.aA.(Unknown Source)

at com.aspose.ocr.OcrEngine.a(Unknown Source)

... 3 more

Please suggest

Regards

Prince Philip

Hi,

Please investigate with the attached image.

Regards

Hi Prince,


Thank you for sharing your sample images.

We have thoroughly evaluated the presented scenario on our end. Unfortunately, we are unable to replicate the said exception while using the latest version of Aspose.OCR for Java 1.9.0 and its corresponding resource archive. However, the recognized results are not correct: we are getting “cull=”. The probable reason for the incorrect results is sticky characters, meaning there is too little space between the characters. I will look further into this matter before logging a ticket in our bug tracking system. In the meantime, could you please test the attached image on your side to see if you can still replicate the exception?

Hi Babar,

Thanks for the response. I tried the sample image you provided, and the text in that image is extracted correctly. It does not work with my image, though, so please look into it and help me.

Regards

Prince Philip

Hi Prince,


We have logged the problem in our bug tracking system under the ticket OCR-33813 for further investigation and correction. Please note that we tried different correction filters, but we were unable to get correct results with the latest version of Aspose.OCR for Java 1.9.0. Please allow us a little time to properly analyze the cause of the problem and to provide a fix as early as possible.

Please note, we were unable to replicate the exception message stated here. In order to troubleshoot this scenario further, please confirm the JDK version used on your side.

Hi Babar,

I am using jdk1.6.0_23 on my system.

Regards

Prince Philip

Hi Prince,


We have set up a new environment on a VM with Windows 7 Home Premium and JDK 1.6.0_23 to evaluate the aforesaid exception message. Unfortunately, we are still unable to replicate the error on our end. Could you please state whether the message appears randomly, and whether it occurs with only a few samples?

Thank you for your cooperation.

Hi Babar,

I reverted a few changes and there are no exceptions now, so please disregard the exceptions. My concern is the output text generated from my image, i.e. "cull=". The code works fine with the sample image you provided.

Looking for your support.

Regards

Prince Philip

Hi Prince,


Thank you for the confirmation on the exception message.

Regarding the accuracy of the recognized data, we have logged a note for the concerned development team member to schedule the ticket (OCR-33813) for thorough analysis as early as possible, and to provide an estimated release schedule for the fix. As soon as we receive more updates in this regard, we will post here for your reference.

Hi Friends,

I am expecting an update from your side. If possible, please provide the planned release date for the fix to the above issue so that I can handle the situation with higher management.

Regards

Prince Philip

Hi Prince,


Thank you for your patience with us.

The ticket attached to this thread is scheduled to be fixed in the upcoming release of Aspose.OCR for Java 2.1.0. We haven’t yet scheduled the aforesaid release. However, as soon as the release containing the fix for this problem is available for public use, we will notify you here with the download link to the upgraded API.

Moreover, in order to recognize your provided sample, we first have to enhance the OcrEngine in the base code. Please note that the current implementation of the OcrEngine is unable to handle images with mingled characters such as those in your shared sample.

Hi Prince,


As discussed earlier, the reason for the incorrectly recognized data from your sample image is the mingled (touching) characters. We are currently working on a fix for this situation. However, we believe that recognizing captcha images requires specific algorithms that differ from those used for OCR of scanned documents. We have added this feature to our road map, but we may not be able to implement such an algorithm in the Aspose.OCR APIs very soon.

The issues you have found earlier (filed as ) have been fixed in this Aspose.Words for JasperReports 18.3 update.