Recognition Block co-ordinates unpredictable

I am trying to create a text recognition block using Aspose.OCR for Java with an evaluation license. Here is the code I am using:


ocrEngine.getConfig().addRecognitionBlock(RecognitionBlock.createTextBlock(870, 870, 730, 45));

At runtime, these are the co-ordinates used:

Block: java.awt.Rectangle[x=878,y=816,width=728,height=154]

No text is detected.
I have used the same sample code as provided in your documentation.

I would like assistance in resolving this and being able to select particular sections on which to perform OCR using your Java libraries.
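
A quick way to quantify the mismatch is to compare the requested rectangle with the one reported at runtime. The following is only a diagnostic sketch in plain java.awt, with no Aspose calls. The class and method names are hypothetical, and the image size passed to fitsInside is an assumption that should be replaced with the real dimensions of the source image.

```java
import java.awt.Rectangle;

public class BlockCheck {

    /** Returns the area shared by the requested block and the block reported at runtime. */
    static Rectangle overlap(Rectangle requested, Rectangle reported) {
        return requested.intersection(reported);
    }

    /** Returns true when the block lies entirely inside an image of the given size. */
    static boolean fitsInside(Rectangle block, int imageWidth, int imageHeight) {
        return new Rectangle(0, 0, imageWidth, imageHeight).contains(block);
    }

    public static void main(String[] args) {
        // Values taken from the post above: what was requested and what was reported.
        Rectangle requested = new Rectangle(870, 870, 730, 45);
        Rectangle reported = new Rectangle(878, 816, 728, 154);

        System.out.println("Overlap: " + overlap(requested, reported));
        System.out.println("Offset:  dx=" + (reported.x - requested.x)
                + ", dy=" + (reported.y - requested.y));
        System.out.println("Resize:  dw=" + (reported.width - requested.width)
                + ", dh=" + (reported.height - requested.height));

        // Hypothetical image size, used only for illustration.
        System.out.println("Fits in 2480x3508 image: " + fitsInside(requested, 2480, 3508));
    }
}
```

If the requested block does not fit inside the image, or the reported block is consistently larger than the requested one, that is worth including in the report along with the sample image.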

Hi Susheel,

Thank you for your inquiry.

Please see the online documentation article Adding User Defined Recognition Blocks for details. Also, please provide a sample image along with the sample code you are using. This will help us investigate the issue.

Code:



//Initialize an instance of OcrEngine
OcrEngine ocrEngine = new OcrEngine();
//Clear notifier list
ocrEngine.clearNotifies();

//Clear recognition blocks
ocrEngine.getConfig().clearRecognitionBlocks();

//Add one rectangle block to the user defined recognition blocks
ocrEngine.getConfig().addRecognitionBlock(RecognitionBlock.createTextBlock(450, 1420, 1700, 120));

//Ignore everything else on the image other than the user defined recognition blocks
ocrEngine.getConfig().setDetectTextRegions(false);

//Set Image property by loading an image from file path
ocrEngine.setImage(ImageStream.fromFile("Converted_Image.jpg"));
//Run recognition process
if (ocrEngine.process())
{
    //Retrieve the user defined blocks that determine the page layout
    List<IRecognitionBlock> blocks = ocrEngine.getConfig().getRecognitionBlocks();
    //Loop over the list of blocks
    for (IRecognitionBlock block : blocks)
    {
        //Check if block has recognition data
        if (block.getRecognitionData() == null)
        {
            System.out.println("Null");
            continue;
        }
        //Display dimension & size of rectangle that defines the recognition block
        System.out.println("Block" + block.getRectangle());
        //Display the recognition results
        IRecognizedTextPartInfo textPartInfo = (IRecognizedTextPartInfo)block.getRecognitionData();
        System.out.println("Text" + textPartInfo.getText());
    }
}

Output:

The aee AcrobacReader is easy zo dowload md can be freely dismibured by
myone.


I have attached the source image.
The text blocks are now in the right place, but OCR is not reading the characters correctly. Can you please guide us on how to improve the output?

Hi Susheel,

Thank you for providing sample image.

You can improve the text recognition by applying different advanced settings. Please see the online documentation article Advance Configurations for details.

Hi,

Thank you for pointing me towards the advanced configurations. They helped a little. However, a major issue remains when I try to perform OCR on an image that contains text in different font sizes. The larger text is recognized correctly, but the smaller text, which makes up the majority of the image, is not recognized correctly at all.

Is there any way to configure Aspose to expect a certain text size to improve the OCR results?
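
One workaround that does not depend on any Aspose setting is to enlarge the image before recognition so that the small text covers more pixels. The following is a minimal sketch using only the standard java.awt and javax.imageio classes. The class name, the method names, the factor of 2.0 and the output file name are assumptions for illustration, and whether upscaling actually improves the results for this image is untested. Any recognition block coordinates would need to be scaled by the same factor.

```java
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class UpscaleForOcr {

    /** Returns a copy of the image enlarged by the given factor using bicubic interpolation. */
    static BufferedImage upscale(BufferedImage source, double factor) {
        int width = (int) Math.round(source.getWidth() * factor);
        int height = (int) Math.round(source.getHeight() * factor);
        BufferedImage scaled = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return scaled;
    }

    /** Scales a block rectangle by the same factor so it still covers the same text. */
    static Rectangle scaleBlock(Rectangle block, double factor) {
        return new Rectangle(
                (int) Math.round(block.x * factor),
                (int) Math.round(block.y * factor),
                (int) Math.round(block.width * factor),
                (int) Math.round(block.height * factor));
    }

    public static void main(String[] args) throws IOException {
        // Input file name taken from the code posted earlier in this thread.
        BufferedImage source = ImageIO.read(new File("Converted_Image.jpg"));
        if (source == null) {
            throw new IOException("Could not decode Converted_Image.jpg");
        }
        // Output name and factor are hypothetical; PNG avoids a second round of JPEG loss.
        ImageIO.write(upscale(source, 2.0), "png", new File("Converted_Image_2x.png"));
        System.out.println("Scaled block: "
                + scaleBlock(new Rectangle(450, 1420, 1700, 120), 2.0));
    }
}
```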

Thanks

Hi Susheel,

Thank you for writing back.

Please send us the sample image on which you want to perform OCR. We will analyze it at our end and update you accordingly.

I am using the Sample PDF that I had attached in my previous messages.

I have attached it in this post, along with the code that I am using and the output I am getting.

Thanks

Hi Susheel,

Thank you for the sample file and details.

We have tested the scenario at our end and were able to reproduce the issue. It has been logged in our system with ID OCR-36039. Our product team will look into it, and we will update you via this forum thread.

The issues you have found earlier (filed as ) have been fixed in this Aspose.Words for JasperReports 18.3 update.