Cannot find OcrEngine

jerryscannell · August 22, 2021, 1:27pm

I just installed Aspose.OCR via nuget and am trying to use it.

I have run into a couple of issues. The biggest one is when I attempted to create an instance of OcrEngine, it doesn’t seem to exist within the Aspose.OCR namespace.

I also was unable to use “RecognizeImage” due to some runtime issue so that’s why I tried the OcrEngine method that I found in your sample page.
Here is the code I am trying:

    public string convertPDF_2_Text ( string strFullPath )
    {
        string strResults = "";

        //AsposeOcr api = new AsposeOcr();

        // Recognize image
        //strResults = api.RecognizeImage ( strFullPath );


        // Initialize an instance of OcrEngine
        Aspose.OCR.OcrEngine ocrEngine = new Aspose.OCR.OcrEngine();

        // Set the Image property by loading the image from file path location or an instance of MemoryStream 
        ocrEngine.Image = ImageStream.FromFile ( strFullPath );

        // Process the image
        if (ocrEngine.Process())
            {
            // Display the recognized text
            strResults = ocrEngine.Text;
            }

        return ( strResults );
    }

asad.ali · August 23, 2021, 6:18pm

@jerryscannell

OcrEngine is an old component of the API that has been discontinued. AsposeOcr is the new class used to perform OCR on an image in the latest version of the API. Could you please make sure that your application targets a .NET Framework version higher than 4.6 and that the debugging platform is x64 (not AnyCPU or x86)? Please try creating a new console application with these settings and installing the API again. If the issue persists, please let us know.

jerryscannell · August 23, 2021, 10:47pm

Thank you. I will try that. Is there any documentation showing examples of applying OCR to a PDF or to a graphic file such as JPG, GIF, or BMP?

asad.ali · August 24, 2021, 9:05pm

@jerryscannell

We have already shared a link to the documentation article that shows the basic method for performing OCR on an image. Could you please tell us more about what you are looking for so that we can share the related information with you?

jerryscannell · August 24, 2021, 10:05pm

I believe that I will be able to find that. Thanks.

jerryscannell · August 25, 2021, 2:05pm

Asad,

I tried the code found in the sample page, but I’m still getting the same runtime error. I can’t find a place to attach screenshots, so I will type the gist of the errors. It actually threw 2 exceptions:

TypeInitializationException: The type initializer for 'Microsoft.ML.OnnxRuntime.NativeMethods' threw an exception.
EntryPointNotFoundException: Unable to find an entry point named 'OrtGetApiBase' in DLL 'onnxruntime'.

Here is the code I am using. The file that I am parsing is a .png file:
    public string convertImage_2_Text ( string strFullPath )
    {
        string strResults = "";
        AsposeOcr api = new AsposeOcr();

        // Recognize image
        //strResults = api.RecognizeImage ( strFullPath );

        using (MemoryStream ms = new MemoryStream())
            using (FileStream file = new FileStream ( strFullPath, FileMode.Open, FileAccess.Read))
                {
                file.CopyTo(ms);
                strResults = api.RecognizeImage(ms);
                }

        return ( strResults );
    }

The exceptions occur on the api.RecognizeImage(ms) call.

jerryscannell · August 25, 2021, 2:55pm

I have an update. I had installed the package into my .dll that is written in C# since all the examples are in C#. I didn’t, however, install the package onto my base project (the one that produces the .exe).

On a whim I decided to install the package there and when finished was able to process the .png file.

However, it didn’t process the entire file because there is a line that says " ************* Trial Licenses ************* ". I didn’t realize that this download was for a trial basis.

Secondly, it didn’t translate all of the text correctly. Is there a way for me to send screenshots to you? All I can do right now is show you the converted text: b 28. 2Ü21 D Michaël J Gēēnbēfū $50.J $D.O
that should have been: Feb. 28, 2021 D Michael J Greenburg $50.00 $0.00

Thanks,
Paul

asad.ali · August 25, 2021, 9:29pm

@jerryscannell

You can use a free 30-day temporary license to evaluate the API without any restrictions.

Furthermore, you can attach screenshots to your post using the upload button in the post editor. uploadfile.png (7.0 KB)

If you are processing an image with non-English characters, you need to specify the language in RecognitionSettings, as follows:

    // dataDir is the folder that holds the input image and receives the output file
    Aspose.OCR.AsposeOcr api = new Aspose.OCR.AsposeOcr();
    // set the language
    var result = api.RecognizeImage(dataDir + "modele-de-facture.jpg", new Aspose.OCR.RecognitionSettings { Language = Aspose.OCR.Language.Fra });
    result.Save(dataDir + "recResults.txt", Aspose.OCR.SaveFormat.Text);

jerryscannell · August 26, 2021, 1:33am

It is English, but the software missed a lot of it. Any suggestions on how to improve the OCR quality?

jerryscannell · August 26, 2021, 3:03pm

I think I know what might have been the problem with the image I tried to parse. All of the text had a colored background. That made it impossible for the software to do its job. I ran RecognizeImage as a test against a document that had a white background and it performed flawlessly! I am very impressed with this product.

Please see the attached document

You said I could get a 30-day full license so I could really put it through its paces. How do I do that, exactly?

Thanks,
Paul
AsposeOCR.docx (139.8 KB)
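
Regarding the colored-background observation above: one possible workaround, not confirmed by anyone in this thread, is to convert the image to pure black and white before handing it to the OCR call. The sketch below is only an assumption of how that might look. RecognizeBinarized and the threshold value of 128 are hypothetical, the pixel work uses plain System.Drawing, and the only Aspose call is the RecognizeImage(MemoryStream) overload already used earlier in the thread.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using Aspose.OCR;

public static class OcrPreprocessing
{
    // Hypothetical helper: thresholds the image to black/white, then runs OCR on the result.
    // The default threshold of 128 is a guess and will likely need tuning per document.
    public static string RecognizeBinarized(AsposeOcr api, string imagePath, int threshold = 128)
    {
        using (var source = new Bitmap(imagePath))
        using (var cleaned = new Bitmap(source.Width, source.Height))
        using (var ms = new MemoryStream())
        {
            for (int y = 0; y < source.Height; y++)
            {
                for (int x = 0; x < source.Width; x++)
                {
                    Color c = source.GetPixel(x, y);
                    // Common luma approximation of perceived brightness
                    int luma = (int)(0.299 * c.R + 0.587 * c.G + 0.114 * c.B);
                    // Dark pixels become black (text), everything else white (background)
                    cleaned.SetPixel(x, y, luma < threshold ? Color.Black : Color.White);
                }
            }

            cleaned.Save(ms, ImageFormat.Png);
            ms.Position = 0; // rewind so the stream is read from the start
            return api.RecognizeImage(ms);
        }
    }
}
```

GetPixel and SetPixel are slow on large scans, so this is meant for experimenting with whether a cleaner background helps, not for production throughput. If the text is lighter than its background, the comparison would need to be inverted.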

jerryscannell · August 26, 2021, 7:18pm

I have made great strides with this. I have a 30-day license so I am doing R & D to figure out what I need to do to extract text from a PDF file with many pages into a single .txt file that I can then use for keyword searches.

I do have one additional technical question for you, though.

Since I will have to run the AsposeOCR process against several image files, I was wondering if I could create a single instance of “AsposeOcr”; apply my license to it; then call the RecognizeImage() function for each image; then dispose of the instance when I’m done.

Thanks so much for all your help,
Paul

asad.ali · August 26, 2021, 9:37pm

@jerryscannell

Thanks for your feedback and showing interest in our API.

You can surely do so. In fact, it is a recommended practice to initialize and set the license only once. You can set the license in the start-up method of your application and use the API method for as long as the application is running. You would not have to set the license while calling the RecognizeImage() method every time. Please feel free to let us know in case you need further assistance.
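
A minimal sketch of the pattern described above, assuming the RecognizeImage(string) overload that returns the recognized text as a string (as in the code posted earlier in this thread; other versions of the API may return a result object instead). BatchOcr, RecognizeAll, and the three parameters are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using Aspose.OCR;

public static class BatchOcr
{
    // Hypothetical helper: recognizes every image and writes the combined text to one file.
    public static void RecognizeAll(IEnumerable<string> imagePaths, string licensePath, string outputPath)
    {
        // Apply the license once, before any recognition call.
        // In a real application this would live in the start-up code.
        var license = new License();
        license.SetLicense(licensePath);

        // Create a single AsposeOcr instance and reuse it for every image.
        var api = new AsposeOcr();
        var combined = new StringBuilder();

        foreach (string path in imagePaths)
        {
            // Assumes this overload returns the recognized text as a string.
            combined.AppendLine(api.RecognizeImage(path));
        }

        // One .txt file that can then be used for keyword searches.
        File.WriteAllText(outputPath, combined.ToString());
    }
}
```

Note that the license is set through the separate License class, so it applies to the process as a whole and is not attached to the AsposeOcr instance itself.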