How to recognize Times New Roman

I’m running some tests with our company forms, but I only get illegible strings. I have tried different file formats, DPIs, color and B/W, etc., but I couldn’t get more than a few characters properly identified.

Then I created a test image in Paint, writing some strings in Times New Roman (the font we use in our forms) and Arial. Only the Arial text is properly extracted.

Attached you can find the image I’m using for testing, and this is the text I get: “M tTRF TiA47R mW RahqAM m ntm FanhoRnFADFRS ARF mztTTFN m R1-nnRF.An m JMRF.RS r-lKF. 45345

(JR l 231 X444

THSSARAL18 REFERENCE584697412A NAME lw ARIAL 98”

Must I programmatically select the font I’m going to read before processing the image?

Thank you

Here is the code I use for testing:

' Requires at the top of the file: Imports System.IO and Imports Aspose.OCR
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    ' Resource file
    Const resourceFileName As String = "C:\Desarrollo\TFS_ONLINE\OCRFomularios\Aspose.OCR.Resources.zip"
    ' Source file: the file on which OCR will be performed
    Dim imageFile As String = "P:\FORMS\FONTEST.JPG"
    ' Output file for the recognized text (used directly as an absolute path;
    ' prefixing it with another directory would produce an invalid path)
    Dim outputFile As String = "P:\FORMS\Output.txt"

    Dim license As Aspose.OCR.License = New Aspose.OCR.License()
    license.SetLicense("C:\Desarrollo\TFS_ONLINE\OCRFomularios\Aspose.OCR.lic")

    ' Initialize OcrEngine
    Dim ocr As New OcrEngine()
    ' Set the image
    ocr.Image = ImageStream.FromFile(imageFile)
    ' Add language
    ocr.Languages.AddLanguage(Language.Load("english"))
    ' ocr.Languages.AddLanguage(Language.Load("spanish"))
    Try
        ' Load the resource file; Using closes the stream once processing is done
        Using resourceStream As New FileStream(resourceFileName, FileMode.Open, FileAccess.Read)
            ocr.Resource = resourceStream
            ' Process the whole image
            If ocr.Process() Then
                ' Get the complete recognized text found in the image
                Console.WriteLine("Text recognized: " & ocr.Text.ToString())
                File.WriteAllText(outputFile, ocr.Text.ToString())
            End If
        End Using
    Catch ex As Exception
        Console.WriteLine("Exception: " & ex.ToString())
    End Try
End Sub

Hi Fernando,


Thank you for contacting Aspose support.

We have evaluated the presented scenario using the latest version, Aspose.OCR for .NET 2.0.0, and we are able to observe the problem of incorrectly recognized data. Please note that the API does not require setting the font face name in order to recognize text rendered in a particular font, so your code is correct. At the moment, we are not sure what could be causing the issue, so we have logged it in our bug tracking system under the ticket OCR-33860 for further investigation. Please allow us some time to properly analyze the cause of the problem on our end. In the meantime, we will keep you posted with updates in this regard.

We are sorry for the inconvenience caused to you.

The issues you have found earlier (filed as ) have been fixed in this Aspose.Words for JasperReports 18.3 update.