Hello guys,
I am using the latest version (3.5.0) of the Aspose OCR .NET assembly. The software I am writing currently uses Puma.NET, as it is relatively fast and gives quite reliable results most of the time. We bought the Aspose OCR .NET version because Puma.NET is not supported and we sometimes get hard crashes with some documents.
I have been trying to implement the Aspose version for a few weeks now, but it only recognizes the simplest documents (Word documents saved as JPG, for example), and it needs at least a minute per page, even for only a few lines of text, which is far too long. I have probably forgotten something, but I load the proper language file and I do get a result, so I'm quite puzzled.
Here is my code. Please assist and TIA! :)
PROCEDURE plf_Import_OCR()
blm_Result is boolean
sgf_File_OCR = CompleteDir(fExeDir()) + "Pain.jpg" // This is a plain Word document converted to JPG and containing 12 lines of text. It takes more than one minute to recognize and the recognition is 90% Ok
// is a commented line
//sgf_File_OCR = CompleteDir(fExeDir()) + "SpanishOCR.bmp"
IF NOT fFileExist(sgf_File_OCR) THEN
STOP
RETURN
END
ogf_Engine_OCR = new "Aspose.OCR".OcrEngine
ogf_Engine_OCR.Image = ImageStream.FromFile(sgf_File_OCR)
// Language
// Clear the default language (English)
ogf_Engine_OCR.LanguageContainer.Clear()
// Load the resources of the language from file path location or an instance of Stream
ogf_Engine_OCR.LanguageContainer.AddLanguage(LanguageFactory.Load("French_language_resource_file_for_Aspose.OCR_for_.NET_3.2.0.zip"))
// Added the following out of despair I guess...
ogf_Engine_OCR.Config.RemoveNonText=True
ogf_Engine_OCR.Config.DetectReadingOrder=True
ogf_Engine_OCR.Config.DetectTextRegions=True
ogf_Engine_OCR.Config.DoSpellingCorrection=True
// Filters (tried them but with no improvement so I "commented" them)
//ogf_Filters = new CorrectionFilters()
//blm_Filter1 is Medianfilter(5)
//ogf_filters.add(blm_Filter1)
//blm_Filter2 is GaussBlurFilter()
//ogf_Filters.add(blm_Filter2)
//blm_Filter3 is RemoveNoiseFilter()
//ogf_Filters.add(blm_Filter3)
blm_Result = ogf_Engine_OCR.Process()
IF NOT blm_Result THEN
STOP
RETURN
END
sgf_Text_OCR = ogf_Engine_OCR.Text.ToString()
Info("Lecture OCR : " + CR + sgf_Text_OCR)
ogf_Engine_OCR.Dispose()