Failure reading text from a scanned image

Hi,

I have scanned a passport and marked some of the fields using OCRENGINE.AddRecognitionBlock. The OCR is not reading any of the fields from the image. For any of the text, it returns only "n" and nothing more.

I created an image, wrote my own text in it, and saved it. Reading that image works fine, but reading the scanned image does not.

I will always be reading scanned images of passports, visas, etc. Is there a solution for this in Aspose?

I am also getting an error on the lines below:
ocrEngine.Languages.AddLanguage(Language.Load("spanish"));
ocrEngine.Config.UseDefaultDictionaries = true;
using (ocrEngine.Resource = new FileStream(resourceFileName, FileMode.Open))

Thanks,
Maheswari S.

Hi Maheswari,

Thank you for your inquiry.

Please forward us the sample scanned passport images so that we can reproduce the issue on our end. We will investigate the issue and update you with our findings. Furthermore, please use the following code snippet to use the OCR engine with a language other than English.

CODE:

//Initialize an instance of OcrEngine
Aspose.OCR.OcrEngine ocrEngine = new Aspose.OCR.OcrEngine();

//Set the Image property by loading the image from file path location or an instance of Stream
ocrEngine.Image = Aspose.OCR.ImageStream.FromFile(imageFile);

//Clear the default language (English)
ocrEngine.LanguageContainer.Clear();

//Load the resources of the language from file path location or an instance of Stream
ocrEngine.LanguageContainer.AddLanguage(Aspose.OCR.LanguageFactory.Load(@"LanguageResources.zip"));

//Process the image
if (ocrEngine.Process())
{
    //Display the recognized text
    Console.WriteLine(ocrEngine.Text);
}

Hi,

Thank you for your answer.

I have uploaded the scanned image as you requested.

Photo.jpg (235.9 KB)

Furthermore, I have tried the code sample you gave. I don't have "LanguageResources.zip". I am using the English language, so I removed the code shown below and ran the remaining code, but it still does not read the correct text.

//Clear the default language (English)
ocrEngine.LanguageContainer.Clear();

//Load the resources of the language from file path location or an instance of Stream
ocrEngine.LanguageContainer.AddLanguage(Aspose.OCR.LanguageFactory.Load(@"LanguageResources.zip"));

One more problem: the evaluation version does not show the complete text either.

Thanks,

Hi Maheswari,

Thank you for sharing the sample with us.

We have evaluated the sample image and found that it has a colored background. We were able to read the text using the code snippet given below; however, the output text has some letters missing. This issue has been logged in our system with ID OCRNET-3265 for further investigation.

Furthermore, English is the default language, so no extra settings are needed for it. To perform OCR on a language other than English, you have to download the resource file from the Resources link and use the code snippet shared in my previous reply.

CODE:

//Initialize an instance of OcrEngine
OcrEngine ocrEngine = new OcrEngine();

//Clear notifier list
ocrEngine.ClearNotifies();

//Clear recognition blocks
ocrEngine.Config.ClearRecognitionBlocks();

ocrEngine.Config.AddRecognitionBlock(RecognitionBlock.CreateTextBlock(63, 1077, 1781, 161));

//Ignore everything else on the image other than the user defined recognition blocks
ocrEngine.Config.DetectTextRegions = false;

ocrEngine.Config.ProcessColoredBackground = true;

//Set Image property by loading an image from file path
ocrEngine.Image = ImageStream.FromFile(@"ocr_test_passport.jpg");

//Run recognition process
if (ocrEngine.Process())
{
    Console.WriteLine("Text recognized: " + ocrEngine.Text);
}

OUTPUT:

P<UTO A DERAS<<LILIA <<<<<<<<<<<<<<<<<<<<<<<
1234567 4UTO 1 14F25 1 17<<<<<<<<<<<<<< 6
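
Because the output above contains stray spaces and missing letters, a quick sanity check on each recognized line can flag an incomplete read before the text is used. The sketch below is not part of Aspose.OCR. It assumes the recognition block covers the machine-readable zone of a passport booklet (ICAO 9303 TD3 layout: two lines of 44 characters, each drawn from A-Z, 0-9 and '<'). The names MrzCheck and IsPlausibleTd3Line are hypothetical helpers introduced only for illustration.

```csharp
using System;
using System.Linq;

public static class MrzCheck
{
    // Assumption: TD3 (passport booklet) MRZ lines are exactly 44 characters long.
    private const int Td3LineLength = 44;

    // Returns true only if the line has the expected length and
    // contains nothing but upper-case letters, digits and the '<' filler.
    public static bool IsPlausibleTd3Line(string line)
    {
        if (line == null || line.Length != Td3LineLength)
        {
            return false;
        }

        return line.All(c =>
            (c >= 'A' && c <= 'Z') ||
            (c >= '0' && c <= '9') ||
            c == '<');
    }
}

public static class Program
{
    public static void Main()
    {
        // A fragment in the style of the OCR output above: it contains spaces
        // and is too short, so it is rejected.
        string recognized = "P<UTO A DERAS<<LILIA <<<";
        Console.WriteLine(MrzCheck.IsPlausibleTd3Line(recognized));

        // A synthetic 44-character line made only of allowed characters.
        string wellFormed = "P<UTO" + new string('<', 39);
        Console.WriteLine(MrzCheck.IsPlausibleTd3Line(wellFormed));
    }
}
```

In practice each line of ocrEngine.Text would be passed to the check, and a line that fails it could be retried with different settings or routed to manual review. The check only confirms the shape of a line, not that its characters were recognized correctly.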