Exception raised when reading text from an image

Hi,


I am trying to run the code below to read text from an image, but I am getting an exception.

Here is the code:
/*
 * To change this license header, choose License Headers in Project Properties.
 * To change this template file, choose Tools | Templates
 * and open the template in the editor.
 */
package testing;

import com.aspose.ocr.ILanguage;
import com.aspose.ocr.ImageStream;
import com.aspose.ocr.Language;
import com.aspose.ocr.OcrEngine;
import java.io.FileInputStream;
import java.io.FileNotFoundException;

/**
 *
 * @author gopi
 */
public class ReadTextfrImage {

    public static String myDir = "/home/gopi/iDosti";

    public static void main(String[] args) throws FileNotFoundException {
        int mb = 1024 * 1024;
        Runtime r = Runtime.getRuntime();
        System.out.println(r.totalMemory() / mb);
        // Set the paths
        String imagePath = "/home/gopi/iDosti/OCR/image/0-9.bmp";
        String resourcesFolderPath = "/home/gopi/iDosti/OCR/resouce/Aspose.OCR.1.5.0.Resources.zip";

        // Create an instance of OcrEngine
        OcrEngine ocr = new OcrEngine();
        // Set Resources for OcrEngine
        ocr.setResource(new FileInputStream(resourcesFolderPath));
        // Set NeedRotationCorrection property to false
        ocr.getConfig().setNeedRotationCorrection(false);

        // Set image file
        ocr.setImage(ImageStream.fromFile(imagePath));

        // Add language
        ILanguage language = Language.load("english");
        ocr.getLanguages().addLanguage(language);

        // Perform OCR and get extracted text
        try {
            if (ocr.process()) {
                System.out.println("\ranswer -> " + ocr.getText());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


I am using Fedora 18 Linux with 8 GB of RAM and NetBeans 7.4.

The exception is:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at com.aspose.internal.ocr.r.D.a(Unknown Source)
at com.aspose.internal.ocr.r.D.c(Unknown Source)
at com.aspose.internal.ocr.r.D.write(Unknown Source)
at com.aspose.internal.ocr.ai.c.a(Unknown Source)
at com.aspose.internal.ocr.r.M.fromJava(Unknown Source)
at com.aspose.ocr.OcrEngine.setResource(Unknown Source)
at testing.ReadTextfrImage.main(ReadTextfrImage.java:35)
Java Result: 1

I added two JAR files:
aspose-ocr-1.5-jdk14.jar
aspose-ocr-1.5-jdk15.jar

Can you please tell me why I am getting this exception?

Thank you

Hi Chandu,

Thank you for using Aspose products, and welcome to the Aspose.OCR support forum.

We recently published Aspose.OCR for Java 1.7.0, which improves memory utilization and includes several other enhancements. Since you are using an older version (1.5.0), we suggest that you download the latest build and its corresponding resource file, then check whether you can still reproduce the exception.

If the problem persists, please share the sample image so that we can investigate in detail. Feel free to write back if you face any difficulty.
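
On the OutOfMemoryError itself: the code above prints totalMemory(), which is the heap currently reserved by the JVM, not the ceiling it is allowed to grow to. A minimal sketch for checking that ceiling, using only standard Runtime calls (the class name HeapReport and its helper methods are illustrative names, not part of Aspose.OCR):

```java
// Illustrative helper, not part of Aspose.OCR: reports JVM heap figures in megabytes.
class HeapReport {

    // Converts a byte count to whole megabytes (integer division).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Builds a one-line summary of the heap limits seen by this JVM.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return "max=" + toMegabytes(rt.maxMemory()) + " MB"      // ceiling (set by -Xmx)
             + ", total=" + toMegabytes(rt.totalMemory()) + " MB" // currently reserved
             + ", free=" + toMegabytes(rt.freeMemory()) + " MB";  // unused part of total
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the reported maximum is small, it can be raised with the standard -Xmx JVM option (for example -Xmx1024m in the project's VM options). Whether that alone resolves the error with version 1.5.0 is not something this thread confirms.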

Hi


I upgraded to Aspose.OCR 1.7.0, but now I am getting a new exception.

The code is:

/*
 * To change this license header, choose License Headers in Project Properties.
 * To change this template file, choose Tools | Templates
 * and open the template in the editor.
 */
package testing;

import com.aspose.ocr.ILanguage;
import com.aspose.ocr.ImageStream;
import com.aspose.ocr.Language;
import com.aspose.ocr.OcrEngine;
import java.io.FileInputStream;
import java.io.FileNotFoundException;

/**
 *
 * @author gopi
 */
public class ReadTextfrImage {

    public static String myDir = "/home/gopi/iDosti";

    public static void main(String[] args) throws FileNotFoundException {
        int mb = 1024 * 1024;
        Runtime r = Runtime.getRuntime();
        System.out.println(r.totalMemory() / mb);
        // Set the paths
        String imagePath = "/home/gopi/iDosti/testImage.png";
        String resourcesFolderPath = "/home/gopi/Softwares/Aspose.OCR.1.7.0.Resources.zip";

        // Create an instance of OcrEngine
        OcrEngine ocr = new OcrEngine();
        // Set Resources for OcrEngine
        ocr.setResource(new FileInputStream(resourcesFolderPath));
        // Set NeedRotationCorrection property to false
        ocr.getConfig().setNeedRotationCorrection(false);

        // Set image file
        ocr.setImage(ImageStream.fromFile(imagePath));

        // Add language
        ILanguage language = Language.load("english");
        ocr.getLanguages().addLanguage(language);

        // Perform OCR and get extracted text
        try {
            if (ocr.process()) {
                System.out.println("\ranswer -> " + ocr.getText());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


New exception:

com.aspose.ocr.OcrException: Error occurred during recognition.
at com.aspose.ocr.OcrEngine.a(Unknown Source)
at com.aspose.ocr.OcrEngine.process(Unknown Source)
at testing.ReadTextfrImage.main(ReadTextfrImage.java:48)
Caused by: class com.aspose.ocr.internal.d: FontFamily Arial not found
Parameter name: Arial
com.aspose.ocr.internal.aE.(Unknown Source)
com.aspose.ocr.internal.aE.(Unknown Source)
com.aspose.ocr.internal.sA.(Unknown Source)
com.aspose.ocr.aU.a(Unknown Source)
com.aspose.ocr.aK.a(Unknown Source)
com.aspose.ocr.aK.a(Unknown Source)
com.aspose.ocr.ag.a(Unknown Source)
com.aspose.ocr.ag.a(Unknown Source)
com.aspose.ocr.ay.a(Unknown Source)
com.aspose.ocr.ay.a(Unknown Source)
com.aspose.ocr.ai.b(Unknown Source)
com.aspose.ocr.ai.a(Unknown Source)
com.aspose.ocr.ai.b(Unknown Source)
com.aspose.ocr.ai.d(Unknown Source)
com.aspose.ocr.ai.a(Unknown Source)
com.aspose.ocr.OcrEngine.a(Unknown Source)
com.aspose.ocr.OcrEngine.process(Unknown Source)
testing.ReadTextfrImage.main(ReadTextfrImage.java:48)
at com.aspose.ocr.internal.aE.(Unknown Source)
at com.aspose.ocr.internal.aE.(Unknown Source)
at com.aspose.ocr.internal.sA.(Unknown Source)
at com.aspose.ocr.aU.a(Unknown Source)
at com.aspose.ocr.aK.a(Unknown Source)
at com.aspose.ocr.aK.a(Unknown Source)
at com.aspose.ocr.ag.a(Unknown Source)
at com.aspose.ocr.ag.a(Unknown Source)
at com.aspose.ocr.ay.a(Unknown Source)
at com.aspose.ocr.ay.a(Unknown Source)
at com.aspose.ocr.ai.b(Unknown Source)
at com.aspose.ocr.ai.a(Unknown Source)
at com.aspose.ocr.ai.b(Unknown Source)
at com.aspose.ocr.ai.d(Unknown Source)
at com.aspose.ocr.ai.a(Unknown Source)
… 3 more

Added JAR files:

aspose-ocr-1.7-jdk16.jar
aspose-omr-1.7-jdk16.jar

Thank you…

Hi Chandu,

Thank you for writing back.

I used your code to successfully perform an OCR operation on a sample image of my own (attached). The problem could be the sample image itself, so please provide it so that we can investigate in more detail. Moreover, please note that Aspose.OCR for Java currently supports the BMP and TIFF file formats only. Other image formats (including PNG) are not supported at the moment.
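
Since the code above points at a PNG file, one option is to convert it to BMP before handing it to the engine. The following is only a sketch using the standard javax.imageio API; the class name PngToBmp and its convert method are illustrative names, not part of Aspose.OCR, and it has not been checked against Aspose.OCR itself.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Illustrative helper, not part of Aspose.OCR: converts an image to a 24-bit BMP.
class PngToBmp {

    static void convert(File source, File target) throws IOException {
        BufferedImage input = ImageIO.read(source);
        if (input == null) {
            throw new IOException("Unsupported or unreadable image: " + source);
        }
        // The standard BMP writer does not accept images with an alpha channel,
        // so the picture is first drawn onto an opaque white RGB canvas.
        BufferedImage rgb = new BufferedImage(
                input.getWidth(), input.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, rgb.getWidth(), rgb.getHeight());
            g.drawImage(input, 0, 0, null);
        } finally {
            g.dispose();
        }
        // ImageIO.write returns false when no suitable writer is found.
        if (!ImageIO.write(rgb, "bmp", target)) {
            throw new IOException("No BMP writer available for: " + target);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("Usage: java PngToBmp <input.png> <output.bmp>");
            return;
        }
        convert(new File(args[0]), new File(args[1]));
    }
}
```

The resulting BMP path can then be passed to ImageStream.fromFile in place of the PNG path.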

Hi,


I used your attached Sampleocr.bmp image, but the same exception occurs again.

The exception is:

com.aspose.ocr.OcrException: Error occurred during recognition.
at com.aspose.ocr.OcrEngine.a(Unknown Source)
at com.aspose.ocr.OcrEngine.process(Unknown Source)
at testing.ReadTextfrImage.main(ReadTextfrImage.java:50)
Caused by: class com.aspose.ocr.internal.d: FontFamily Arial not found
Parameter name: Arial
com.aspose.ocr.internal.aE.(Unknown Source)
com.aspose.ocr.internal.aE.(Unknown Source)
com.aspose.ocr.internal.sA.(Unknown Source)
com.aspose.ocr.aU.a(Unknown Source)
com.aspose.ocr.aK.a(Unknown Source)
com.aspose.ocr.aK.a(Unknown Source)
com.aspose.ocr.ag.a(Unknown Source)
com.aspose.ocr.ag.a(Unknown Source)
com.aspose.ocr.ay.a(Unknown Source)
com.aspose.ocr.ay.a(Unknown Source)
com.aspose.ocr.ai.b(Unknown Source)
com.aspose.ocr.ai.a(Unknown Source)
com.aspose.ocr.ai.b(Unknown Source)
com.aspose.ocr.ai.d(Unknown Source)
com.aspose.ocr.ai.a(Unknown Source)
com.aspose.ocr.OcrEngine.a(Unknown Source)
com.aspose.ocr.OcrEngine.process(Unknown Source)
testing.ReadTextfrImage.main(ReadTextfrImage.java:50)
at com.aspose.ocr.internal.aE.(Unknown Source)
at com.aspose.ocr.internal.aE.(Unknown Source)
at com.aspose.ocr.internal.sA.(Unknown Source)
at com.aspose.ocr.aU.a(Unknown Source)
at com.aspose.ocr.aK.a(Unknown Source)
at com.aspose.ocr.aK.a(Unknown Source)
at com.aspose.ocr.ag.a(Unknown Source)
at com.aspose.ocr.ag.a(Unknown Source)
at com.aspose.ocr.ay.a(Unknown Source)
at com.aspose.ocr.ay.a(Unknown Source)
at com.aspose.ocr.ai.b(Unknown Source)
at com.aspose.ocr.ai.a(Unknown Source)
at com.aspose.ocr.ai.b(Unknown Source)
at com.aspose.ocr.ai.d(Unknown Source)
at com.aspose.ocr.ai.a(Unknown Source)
… 3 more

The code is:

/*
 * To change this license header, choose License Headers in Project Properties.
 * To change this template file, choose Tools | Templates
 * and open the template in the editor.
 */
package testing;

import com.aspose.ocr.ILanguage;
import com.aspose.ocr.ImageStream;
import com.aspose.ocr.Language;
import com.aspose.ocr.OcrEngine;
import java.io.FileInputStream;
import java.io.FileNotFoundException;



/**
 *
 * @author gopi
 */
public class ReadTextfrImage {

    public static String myDir = "/home/gopi/iDosti";

    public static void main(String[] args) throws FileNotFoundException {
        int mb = 1024 * 1024;
        Runtime r = Runtime.getRuntime();
        System.out.println(r.totalMemory() / mb);
        // Set the paths
        String imagePath = "/home/gopi/iDosti/OCR/image/Sampleocr.bmp";
        String resourcesFolderPath = "/home/gopi/iDosti/OCR/resouce/Aspose.OCR.1.7.0.Resources.zip";

        // Create an instance of OcrEngine
        OcrEngine ocr = new OcrEngine();
        // Set Resources for OcrEngine
        ocr.setResource(new FileInputStream(resourcesFolderPath));
        // Set NeedRotationCorrection property to false
        ocr.getConfig().setNeedRotationCorrection(false);

        // Set image file
        ocr.setImage(ImageStream.fromFile(imagePath));

        // Add language
        ILanguage language = Language.load("english");
        ocr.getLanguages().addLanguage(language);

        // Perform OCR and get extracted text
        try {
            if (ocr.process()) {
                System.out.println("\ranswer -> " + ocr.getText());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Hi Chandu,

We are sorry for the inconvenience.

We are currently looking into the exception message in order to suggest a solution. In the meantime, please provide your environment details, such as operating system, service pack and JDK version. As soon as we have these details, we will try to reproduce the environment on our end and replicate the exception.

On a similar note, could you please try extracting the resource archive to make sure the downloaded package isn’t corrupt or damaged?
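
If extracting the archive by hand is inconvenient, the same check can be sketched in Java with the standard java.util.zip classes: open the archive, read every entry in full, and compare each entry's CRC-32 with the value recorded in the archive. The class name ZipIntegrityCheck and its isReadable method are illustrative names, not part of Aspose.OCR.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Illustrative helper, not part of Aspose.OCR: checks that a ZIP archive
// opens and that every entry decompresses with the recorded CRC-32.
class ZipIntegrityCheck {

    static boolean isReadable(String zipPath) {
        try (ZipFile zip = new ZipFile(zipPath)) {
            byte[] buffer = new byte[8192];
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                try (CheckedInputStream in =
                        new CheckedInputStream(zip.getInputStream(entry), new CRC32())) {
                    // Read the whole entry so the checksum covers all of its bytes.
                    while (in.read(buffer) != -1) {
                        // nothing to do, the stream updates the checksum
                    }
                    long expected = entry.getCrc(); // -1 means "not recorded"
                    if (expected != -1 && expected != in.getChecksum().getValue()) {
                        return false;
                    }
                }
            }
            return true;
        } catch (IOException e) {
            // Covers ZipException for truncated or non-ZIP files as well.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("Usage: java ZipIntegrityCheck <archive.zip>");
            return;
        }
        System.out.println(isReadable(args[0]) ? "Archive looks intact" : "Archive is damaged");
    }
}
```

A result of "damaged" suggests downloading the resource file again before investigating further.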

Hi,


I am using Fedora 18 Linux, JDK 1.7 and NetBeans 7.4.

Thank you

Hi Chandu,

Thank you for providing the environment details. We are setting up Fedora 20 on our end (18 isn’t available for download) in order to investigate the cause of the problem. Moreover, a ticket (OCR-33694) has been logged in our bug tracking system to look into the exception “Error occurred during recognition. FontFamily Arial not found. Parameter name: Arial”.

We will keep you posted with updates in this regard. Please accept our sincere apologies for the inconvenience caused.

Hi Chandu,

Thank you for your patience with us.

We have evaluated your scenario on Fedora 20 and were able to replicate the exception. It occurs because the Aspose.OCR for Java API could not find the required font(s), so it threw an error naming the font used in the sample image. Please note that Linux distributions such as Fedora do not include any Microsoft TrueType fonts (TTFs) by default, whereas Aspose.OCR for Java requires these fonts to be installed on the machine where the OCR operation is performed.

Please follow the instructions below to install the Microsoft TTFs package, which includes the following font families:

  • Andale Mono
  • Arial Black/Arial (Bold, Italic, Bold Italic)
  • Comic Sans MS (Bold)
  • Courier New (Bold, Italic, Bold Italic)
  • Georgia (Bold, Italic, Bold Italic)
  • Impact
  • Tahoma
  • Times New Roman (Bold, Italic, Bold Italic)
  • Trebuchet (Bold, Italic, Bold Italic)
  • Verdana (Bold, Italic, Bold Italic)
  • Webdings

Step-1: Make sure the following RPM packages are installed.

  1. rpm-build
    • If not previously installed, use the command yum install rpm-build cabextract ttmkfdir
  2. wget
    • If not previously installed, use the command yum -y install wget

Step-2: Download the latest msttcorefonts spec file from SourceForge using the following command:

wget http://corefonts.sourceforge.net/msttcorefonts-2.5-1.spec

Step-3: Build an RPM file from the downloaded spec file using the following command:

rpmbuild -ba msttcorefonts-2.5-1.spec

Step-4: The RPM file will be stored in /root/rpmbuild/RPMS/noarch/. Install it as follows:

rpm -ivh /root/rpmbuild/RPMS/noarch/msttcorefonts-2.5-1.noarch.rpm

That is all. Restart the machine for the changes to take effect, then re-run the code you provided earlier; it should produce the desired results.
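
After the restart, one way to confirm that the JVM can see the Arial family is a small check like the one below. It is only a sketch: FontCheck and containsFamily are illustrative names, and it assumes that Aspose.OCR resolves fonts through the same standard Java font lookup, which this thread does not confirm.

```java
import java.awt.GraphicsEnvironment;

// Illustrative helper, not part of Aspose.OCR: reports whether a font family
// is visible to the JVM.
class FontCheck {

    // Case-insensitive search for a family name in a list of family names.
    static boolean containsFamily(String[] families, String wanted) {
        for (String family : families) {
            if (family.equalsIgnoreCase(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Defaults to the family named in the exception message.
        String wanted = args.length > 0 ? args[0] : "Arial";
        String[] families = GraphicsEnvironment
                .getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        System.out.println(wanted + (containsFamily(families, wanted)
                ? " is available to the JVM"
                : " is NOT available to the JVM"));
    }
}
```

Running it before and after the installation steps should show whether the font package made a difference.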

Hi Chandu,


Just to keep you informed, we have closed the ticket logged earlier as OCR-33694, because the behavior it describes is not a bug in the Aspose.OCR for Java API. Rather, the issue is related to the configuration of fonts in the development environment. Please feel free to contact us anytime if you have any concerns or questions regarding Aspose APIs.
