Issue evaluating Aspose.OCR

I am evaluating Aspose.OCR. I need to take a flattened PDF file, process a page with grids on it, and turn that image of a grid into actual grids. I need to do this across many files and many structures.

The code I have hangs at the ocr.Recognize line. It runs and runs for over 10 minutes with no error code and no time-out, and I have no idea what is wrong. I have coded this a few different ways, using a MemoryStream and a PDF file. Please help so I can evaluate this product.

If there is a better way to get tables from a flattened pdf file I am open to all ideas you have for accuracy and processing speed.

I can already process not flattened pdf files, so if there were a way to turn a flattened file into a data pdf file that would work as well.

Dim results As List(Of Aspose.OCR.RecognitionResult) = ocr.Recognize(input, settings)
This line hangs no matter how I try to use it.

Here is the code I am running, and it hangs on that line.

The attached PDF file is a one-page file with some grids on it. The file is flattened, and I need to read grids on flattened files. I can't evaluate your OCR API if I can't use it.

It keeps hanging on the results line. It just runs and runs with no errors, no time-out, and no reason given. Any help would be appreciated.

Public Function ConvertFlattenedPdfPageToCsvString(pdfPath As String, pageNumber As Integer, Optional dpi As Integer = 300) As String
    ' 1) Render the page to PNG bytes
    Dim pngBytes As Byte() = RenderPageToPng(pdfPath, pageNumber, dpi)

    ' 2) Build OCR input from the PNG stream
    Dim ocr As New Aspose.OCR.AsposeOcr()
    Dim settings As New Aspose.OCR.RecognitionSettings() With {
        .DetectAreasMode = Aspose.OCR.DetectAreasMode.TABLE
    }

    Using ms As New MemoryStream(pngBytes)
        Dim input As New Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage)
        ms.Position = 0
        input.Add(ms)

        ' 3) Run OCR (returns List(Of RecognitionResult)); take the first result
        Dim results As List(Of Aspose.OCR.RecognitionResult) = ocr.Recognize(input, settings)
        If results Is Nothing OrElse results.Count = 0 Then
            Throw New InvalidOperationException("OCR returned no results.")
        End If
        Dim result As Aspose.OCR.RecognitionResult = results(0)

        ' 4) Save OCR result to XLSX in-memory
        Using xlsxStream As New MemoryStream()
            result.Save(xlsxStream, Aspose.OCR.SaveFormat.Xlsx)
            xlsxStream.Position = 0

            ' 5) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
            Dim wb As New Workbook(xlsxStream)
            Dim csvOpts As New TxtSaveOptions(Aspose.Cells.SaveFormat.Csv) With {
                .Separator = ","c,
                .Encoding = System.Text.Encoding.UTF8,
                .AlwaysQuoted = True
            }

            Using csvStream As New MemoryStream()
                wb.Save(csvStream, csvOpts)
                Return System.Text.Encoding.UTF8.GetString(csvStream.ToArray())
            End Using
        End Using
    End Using
End Function

'--- Private: Render a single PDF page to PNG bytes at the specified DPI
Private Function RenderPageToPng(pdfPath As String,
                                 pageNumber As Integer,
                                 dpi As Integer) As Byte()
    Using doc As New Document(pdfPath)
        If pageNumber < 1 OrElse pageNumber > doc.Pages.Count Then
            Throw New ArgumentOutOfRangeException(NameOf(pageNumber),
                $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.")
        End If

        Dim res As New Resolution(dpi)
        Dim device As New PngDevice(res)

        Using outMs As New MemoryStream()
            device.Process(doc.Pages(pageNumber), outMs)
            Return outMs.ToArray()
        End Using
    End Using
End Function

Best Regards,

Mase Woods
MaseW@bidtracer.com
480.734.5077

Here is the PDF file I have been working with. It is one page with a number of grids.
Grace_Lutheran.pdf (637.1 KB)

@maseyo

It seems you’re experiencing an issue with the OCR process hanging when trying to recognize text from a flattened PDF page. Here are some steps and suggestions to troubleshoot and potentially resolve the issue:

  1. Check PDF Rendering: Ensure that the PDF page is being rendered correctly to a PNG image. You can add logging to verify that the RenderPageToPng function completes successfully and returns a valid byte array.

  2. Adjust DPI Settings: A high DPI setting can sometimes lead to performance issues. Try reducing the DPI value (e.g., to 150 or 200) in the ConvertFlattenedPdfPageToCsvString function to see if it improves processing time.

  3. Use a Different Input Type: Instead of using OcrInput with InputType.SingleImage, consider using InputType.MultiImage if you plan to process multiple images at once. This can sometimes help with performance.

  4. Increase Timeout: If the OCR process is taking too long, consider implementing a timeout mechanism to avoid hanging indefinitely. You can run the OCR in a separate task and use a cancellation token.

  5. Test with Different Images: To isolate the issue, test the OCR process with different PNG images (not just from the PDF) to see if the problem persists. This can help determine if the issue is with the specific PDF or the OCR setup.

  6. Check for Updates: Ensure you are using the latest version of the Aspose.OCR library, as updates may contain performance improvements and bug fixes.

  7. Alternative Approach: If the OCR continues to hang, consider using Aspose.PDF to convert the flattened PDF to a more structured format (like DOCX) before applying OCR. This can sometimes yield better results for table extraction.

Here’s a minimal code snippet that reduces the DPI and checks that the page actually rendered:

' Reduce the DPI and verify that rendering produced image data
Dim dpi As Integer = 150 ' Reduced DPI
Dim pngBytes As Byte() = RenderPageToPng(pdfPath, pageNumber, dpi)
If pngBytes Is Nothing OrElse pngBytes.Length = 0 Then
    Throw New InvalidOperationException("Failed to render PDF page to PNG.")
End If
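
The timeout idea from step 4 can be sketched in a similar way. This is only a rough sketch in C#: RunWithTimeout is a hypothetical helper and not part of Aspose.OCR, and it assumes it is acceptable for the caller to give up waiting while the background work keeps running. Task.Wait with a timeout does not stop a call that is genuinely stuck; it only lets the calling code report the problem instead of waiting forever.

```csharp
using System;
using System.Threading.Tasks;

public static class OcrTimeoutSketch
{
    // Hypothetical helper (not an Aspose.OCR API): runs `work` on a
    // thread-pool thread and waits at most `timeout` for it to finish.
    public static T RunWithTimeout<T>(Func<T> work, TimeSpan timeout)
    {
        Task<T> task = Task.Run(work);

        // Wait returns false if the task has not completed in time.
        // The background work is NOT cancelled; it keeps running.
        if (!task.Wait(timeout))
        {
            throw new TimeoutException(
                $"Operation did not finish within {timeout.TotalSeconds} seconds.");
        }

        // Exceptions thrown inside `work` surface here (or from Wait)
        // wrapped in an AggregateException.
        return task.Result;
    }
}
```

With the code from this thread, the call would look something like OcrTimeoutSketch.RunWithTimeout(() => ocr.Recognize(input, settings), TimeSpan.FromMinutes(2)). If it times out, that at least confirms the hang is inside Recognize rather than in the rendering step.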

If these suggestions do not resolve the issue, please provide any error messages or logs that may help further diagnose the problem.

Best of luck with your evaluation!

  1. Use a Different Input Type: Instead of using OcrInput with InputType.SingleImage, consider using InputType.MultiImage if you plan to process multiple images at once. This can sometimes help with performance.

There is no option for multiimage

I am using ocr 25.9.1.0

Can you not get me code to turn this file into the grids? What are the best practices? I just need to be able to scroll through each row and column of each grid.

What are the best practices? Where is the documentation for this?

@maseyo

We have tested in our environment using your code snippet and could not replicate the issue. The output is attached, and the code converted to C# is below for your reference:

private static void PerformOCROnPDFTable(string dataDir)
{
    // 1) Render the page to PNG bytes
    byte[] pngBytes = RenderPageToPng(dataDir + "Grace_Lutheran.pdf", 1, 300);

    // 2) Build OCR input from the PNG stream
    var ocr = new OCR.AsposeOcr();
    var settings = new OCR.RecognitionSettings
    {
        DetectAreasMode = OCR.DetectAreasMode.TABLE
    };

    using (var ms = new MemoryStream(pngBytes))
    {
        var input = new OCR.OcrInput(OCR.InputType.SingleImage);
        ms.Position = 0;
        input.Add(ms);

        // 3) Run OCR (returns List<RecognitionResult>); take the first result
        List<OCR.RecognitionResult> results = ocr.Recognize(input, settings);
        if (results == null || results.Count == 0)
            throw new InvalidOperationException("OCR returned no results.");

        OCR.RecognitionResult result = results[0];

        // 4) Save OCR result to XLSX in-memory
        using (var xlsxStream = new MemoryStream())
        {
            result.Save(xlsxStream, OCR.SaveFormat.Xlsx);
            xlsxStream.Position = 0;

            // 5) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
            var wb = new Aspose.Cells.Workbook(xlsxStream);
            var csvOpts = new Aspose.Cells.TxtSaveOptions(Aspose.Cells.SaveFormat.Csv)
            {
                Separator = ',',
                Encoding = Encoding.UTF8,
                AlwaysQuoted = true
            };

            // Save the CSV straight to disk; no intermediate stream is needed
            wb.Save(dataDir + "output.csv", csvOpts);
        }
    }
}

private static byte[] RenderPageToPng(string pdfPath, int pageNumber, int dpi)
{
    using (var doc = new Document(pdfPath))
    {
        if (pageNumber < 1 || pageNumber > doc.Pages.Count)
            throw new ArgumentOutOfRangeException(nameof(pageNumber),
                $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.");

        var res = new Resolution(dpi);
        var device = new PngDevice(res);

        using (var outMs = new MemoryStream())
        {
            device.Process(doc.Pages[pageNumber], outMs);
            return outMs.ToArray();
        }
    }
}

output.zip (4.4 KB)

Would you kindly share your complete environment details, such as OS name and version, .NET Framework version, installed RAM size, etc.? If possible, please share a sample console application in .zip format that we can use to replicate the issue you are facing.

Please confirm: by the above, do you mean generating a CSV or Excel file, as you are already trying to do?

First off, THANK YOU! I really appreciate your help.

Unfortunately, the first thing I did was look at the output. The Excel file and the grids in the PDF file are not aligned at all. I would not be able to scroll through each grid with this output.

With your Aspose.PDF API I can read each table and scroll through each row.

Is what I am looking for possible? Can your api simply take this flattened page and get me tables like pdf api does? Would it be possible to make the flattened file into a not-flattened pdf file?

Would the ocr.RecognitionResult object + result.CsvToTable() method do a better job?

Is there an AI that would do this?

I am asking for guidance as to how to get flattened table data in a usable way.

Dim ocr As New Aspose.OCR.AsposeOcr()
Dim result As Aspose.OCR.RecognitionResult = ocr.Recognize("table.png")

' Convert the CSV representation of the recognized table into structured form:
Dim table As List(Of List(Of String)) = result.CsvToTable()
Dim rows As List(Of String()) = result.CsvToRows()

@maseyo

Thanks for elaborating on your requirements. We need to investigate the feasibility on our end from the Aspose.OCR perspective, and for that we will register this case in our issue management system. Would you kindly share a sample of the expected output PDF for our reference so that it can be logged with the ticket as well? We will proceed accordingly.

Yes, I'd like an object in memory: a list of tables, with rows and cells, like in the PDF API. That would be great. We could turn that into XML, JSON, XLSX, or CSV; we could go anywhere from there.

@maseyo

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): OCRNET-1122

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

I don't need paid support. Your product is supposed to be able to read flattened PDF files and get the tables out of them. I need that. Just regular support. I need to know I can do this before purchasing the product. Thanks :)

I have ocr version 25.9.1.0

OCR.DetectAreasMode does not exist: .DetectAreasMode = OCR.DetectAreasMode.TABLE
OCR.InputType does not exist: Dim input As New OCR.OcrInput(OCR.InputType.SingleImage)

ocr.SaveFormat does not exist: result.Save(xlsxStream, ocr.SaveFormat.Xlsx) and
Dim csvOpts As New TxtSaveOptions(SaveFormat.CSV) With {

Can you please let me know the workarounds and see if you can get that to work in C#, so I can get it to work in VB.NET? Thanks.

Private Shared Sub PerformOCROnPDFTable(dataDir As String)
    ' 1) Render the page to PNG bytes
    Dim pngBytes As Byte() = RenderPageToPng(Path.Combine(dataDir, "Grace_Lutheran.pdf"), 1, 300)

    ' 2) Build OCR input from the PNG stream
    Dim ocr = New OCR.AsposeOcr()
    Dim settings As New OCR.RecognitionSettings() With {
        .DetectAreasMode = OCR.DetectAreasMode.TABLE
    }

    Using ms As New MemoryStream(pngBytes)
        Dim input As New OCR.OcrInput(OCR.InputType.SingleImage)
        ms.Position = 0
        input.Add(ms)

        ' 3) Run OCR (returns List(Of RecognitionResult)); take the first result
        Dim results As System.Collections.Generic.List(Of OCR.RecognitionResult) = ocr.Recognize(input, settings)
        If results Is Nothing OrElse results.Count = 0 Then
            Throw New InvalidOperationException("OCR returned no results.")
        End If

        Dim result As OCR.RecognitionResult = results(0)

        ' 4) Save OCR result to XLSX in-memory
        Using xlsxStream As New MemoryStream()
            result.Save(xlsxStream, OCR.SaveFormat.Xlsx)
            xlsxStream.Position = 0

            ' 5) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
            Dim wb As New Workbook(xlsxStream)
            Dim csvOpts As New TxtSaveOptions(SaveFormat.CSV) With {
                .Separator = ","c,
                .Encoding = Encoding.UTF8,
                .AlwaysQuoted = True
            }

            ' Save CSV to disk (e.g., App_Data\output.csv)
            Dim outCsvPath As String = Path.Combine(dataDir, "output.csv")
            wb.Save(outCsvPath, csvOpts)
        End Using
    End Using
End Sub

Private Shared Function RenderPageToPng(pdfPath As String, pageNumber As Integer, dpi As Integer) As Byte()
    Using doc As New Document(pdfPath)
        If pageNumber < 1 OrElse pageNumber > doc.Pages.Count Then
            Throw New ArgumentOutOfRangeException(NameOf(pageNumber),
                $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.")
        End If

        Dim res As New Resolution(dpi)
        Dim device As New PngDevice(res)

        Using outMs As New MemoryStream()
            device.Process(doc.Pages(pageNumber), outMs)
            Return outMs.ToArray()
        End Using
    End Using
End Function

OK, I got it to compile, but it still hangs on the Recognize line:

 Dim results As System.Collections.Generic.List(Of OCR.RecognitionResult) = ocr.Recognize(input, settings)

Here is the .NET code. This is a VB project. It compiles but hangs.

Same file, same everything.

Public Shared Sub PerformOCROnPDFTable(ByRef dataDir As String, ByRef FIleName As String)
 ' 1) Render the page to PNG bytes
 Dim pngBytes As Byte() = RenderPageToPng(Path.Combine(dataDir, FIleName), 1, 300)

 ' 2) Build OCR input from the PNG stream
 Dim ocr = New OCR.AsposeOcr()
 Dim settings As New OCR.RecognitionSettings() With {
     .DetectAreasMode = Aspose.OCR.DetectAreasMode.TABLE
 }

 Using ms As New MemoryStream(pngBytes)
     Dim input As New OCR.OcrInput(Aspose.OCR.InputType.SingleImage)
     ms.Position = 0
     input.Add(ms)

     ' 3) Run OCR (returns List(Of RecognitionResult)); take the first result
     Dim results As System.Collections.Generic.List(Of OCR.RecognitionResult) = ocr.Recognize(input, settings)
     If results Is Nothing OrElse results.Count = 0 Then
         Throw New InvalidOperationException("OCR returned no results.")
     End If

     Dim result As OCR.RecognitionResult = results(0)

     ' 4) Save OCR result to XLSX in-memory
     Using xlsxStream As New MemoryStream()
         result.Save(xlsxStream, Aspose.OCR.SaveFormat.Xlsx)
         xlsxStream.Position = 0

         ' 5) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
         Dim wb As New Workbook(xlsxStream)
         Dim csvOpts As New Aspose.Cells.TxtSaveOptions(Aspose.Cells.SaveFormat.Csv) With {
             .Separator = ","c,
             .Encoding = Encoding.UTF8,
             .AlwaysQuoted = True
         }

         ' Save CSV to disk (e.g., App_Data\output.csv)
         Dim outCsvPath As String = Path.Combine(dataDir, "output.csv")
         wb.Save(outCsvPath, csvOpts)
     End Using
 End Using

End Sub

@maseyo

We apologize for the confusion. We are not asking you to get paid support :). The generic message was posted automatically when a ticket was attached to this forum thread. We have already reviewed your requirements and noticed that the API is unable to produce the expected results. That is why we logged a ticket in our internal issue tracking system to analyze the scenario in detail.

We will look into the details of the ticket on a first-come, first-served basis, and as soon as we have updates regarding its resolution, we will inform you via this forum thread. Please be patient and spare us some time.

We are sorry for the inconvenience.

PS: We were not able to replicate the issue where the program hangs. Can you please share a sample console application in .zip format that we can use to reproduce the issue?

I will make you a sample console app. I just wanted to let you know I added a C# project to my solution and call it from the VB project. It still hangs in the exact same place. The line that hangs is var results = ocr.Recognize(input, settings);

I tried using both a MemoryStream and a PNG file that actually exists on disk. This is quite puzzling. I am using .NET Framework 4.7.2 (required) on Windows 11 Home, version 10.0.26200.

OK, I will make a console app in VB now.

using Aspose.Pdf;
using Aspose.Pdf.Devices;
using Aspose.Pdf.Operators;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices.ComTypes;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;
using Cells = Aspose.Cells;
using Ocr = Aspose.OCR;

//using Aspose.OCR;
//using Aspose.Cells;

namespace AsposeCSharp
{
    public class OCRAspose
    {
        public static void PerformOCROnPDFTable(string dataDir)
        {
            string pdfPath = Path.Combine(dataDir, "Grace_Lutheran.pdf");
            string outPath = Path.Combine(dataDir, "Grace_Lutheran.png");

            int pageNumber = 1;
            int dpi = 300;
            var ocr = new Ocr.AsposeOcr();

            //byte[] pngBytes = RenderPageToPng(pdfPath, pageNumber, dpi);
            RenderPageToPng(pdfPath, outPath, pageNumber, dpi);

            var settings = new Ocr.RecognitionSettings
            {
                DetectAreasMode = Ocr.DetectAreasMode.TABLE
            };

            var input = new Ocr.OcrInput(Ocr.InputType.SingleImage);
            //using (var ms = new MemoryStream(pngBytes))
            //{
            //    ms.Position = 0;
            //    input.Add(ms);
            //}
            input.Add(outPath);

            var results = ocr.Recognize(input, settings);
            var result = results[0];

            // --- Disambiguate SaveFormats explicitly ---
            using (var xlsxStream = new MemoryStream())
            {
                result.Save(xlsxStream, Ocr.SaveFormat.Xlsx);  // ← fully qualified
                xlsxStream.Position = 0;

                var wb = new Cells.Workbook(xlsxStream);
                var csvOpts = new Cells.TxtSaveOptions(Cells.SaveFormat.Csv) // ← fully qualified
                {
                    Separator = ',',
                    Encoding = Encoding.UTF8,
                    QuoteType = Cells.TxtValueQuoteType.Always
                };

                wb.Save(Path.Combine(dataDir, "output.csv"), csvOpts);
            }
        }

        private static byte[] RenderPageToPng(string pdfPath, int pageNumber, int dpi)
        {
            using (var doc = new Document(pdfPath))
            {
                if (pageNumber < 1 || pageNumber > doc.Pages.Count)
                    throw new ArgumentOutOfRangeException(nameof(pageNumber),
                        $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.");

                var res = new Resolution(dpi);
                var device = new PngDevice(res);

                using (var outMs = new MemoryStream())
                {
                    device.Process(doc.Pages[pageNumber], outMs);
                    return outMs.ToArray();
                }
            }
        }

        static void RenderPageToPng(string pdfPath, string outputPngPath, int pageNumber, int dpi)
        {
            using (Document pdfDocument = new Document(pdfPath))
            {
                if (pageNumber < 1 || pageNumber > pdfDocument.Pages.Count)
                    throw new ArgumentOutOfRangeException(nameof(pageNumber),
                        $"Page {pageNumber} is out of range. Document has {pdfDocument.Pages.Count} pages.");

                Resolution resolution = new Resolution(dpi);
                PngDevice pngDevice = new PngDevice(resolution);

                using (FileStream imageStream = new FileStream(outputPngPath, FileMode.Create))
                {
                    pngDevice.Process(pdfDocument.Pages[pageNumber], imageStream);
                }
            }
        }
    }
}

@maseyo

Have you tried running the application in Release mode instead of Debug? We believe it will require detailed debugging if the OCR-related code is in another project and is called through a reference. It would be very helpful if you could provide a minimal sample console application that reproduces the issue in our environment. That way we can investigate and address it for you. We sincerely apologize for the inconvenience you have been facing.

PS: Have you been using a 30-day free temporary license for API evaluation?

Here is the C# code in the other project I added to my VB project. Please feel free to run it however you need to.

I am working on the console app in VB.NET now. I'll send it along as soon as possible.

using Aspose.Pdf;
using Aspose.Pdf.Devices;
using Aspose.Pdf.Operators;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices.ComTypes;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;
using Cells = Aspose.Cells;
using Ocr = Aspose.OCR;

//using Aspose.OCR;
//using Aspose.Cells;

namespace AsposeCSharp
{
    public class OCRAspose
    {

      public static void PerformOCROnPDFTable(string dataDir)
        {
            string pdfPath = Path.Combine(dataDir, "Grace_Lutheran.pdf");
            string outPath = Path.Combine(dataDir, "Grace_Lutheran.png");

            int pageNumber = 1;
            int dpi = 300;
            var ocr = new Ocr.AsposeOcr();

            //byte[] pngBytes = RenderPageToPng(pdfPath, pageNumber, dpi);
            RenderPageToPng(pdfPath, outPath, pageNumber, dpi);

            var settings = new Ocr.RecognitionSettings
            {
                DetectAreasMode = Ocr.DetectAreasMode.TABLE
            };

            var input = new Ocr.OcrInput(Ocr.InputType.SingleImage);
            //using (var ms = new MemoryStream(pngBytes))
            //{
            //    ms.Position = 0;
            //    input.Add(ms);
            //}
            input.Add(outPath);

            var results = ocr.Recognize(input, settings);
            var result = results[0];

            // --- Disambiguate SaveFormats explicitly ---
            using (var xlsxStream = new MemoryStream())
            {
                result.Save(xlsxStream, Ocr.SaveFormat.Xlsx);  // ← fully qualified
                xlsxStream.Position = 0;

                var wb = new Cells.Workbook(xlsxStream);
                var csvOpts = new Cells.TxtSaveOptions(Cells.SaveFormat.Csv) // ← fully qualified
                {
                    Separator = ',',
                    Encoding = Encoding.UTF8,
                    QuoteType = Cells.TxtValueQuoteType.Always
                };

                wb.Save(Path.Combine(dataDir, "output.csv"), csvOpts);
            }
        }
        
       

        private static byte[] RenderPageToPng(string pdfPath, int pageNumber, int dpi)
        {
            using (var doc = new Document(pdfPath))
            {
                if (pageNumber < 1 || pageNumber > doc.Pages.Count)
                    throw new ArgumentOutOfRangeException(nameof(pageNumber),
                        $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.");

                var res = new Resolution(dpi);
                var device = new PngDevice(res);

                using (var outMs = new MemoryStream())
                {
                    device.Process(doc.Pages[pageNumber], outMs);
                    return outMs.ToArray();
                }
            }
        }

        static void RenderPageToPng(string pdfPath, string outputPngPath, int pageNumber, int dpi)
        {
            using (Document pdfDocument = new Document(pdfPath))
            {
                if (pageNumber < 1 || pageNumber > pdfDocument.Pages.Count)
                    throw new ArgumentOutOfRangeException(nameof(pageNumber),
                        $"Page {pageNumber} is out of range. Document has {pdfDocument.Pages.Count} pages.");

                Resolution resolution = new Resolution(dpi);
                PngDevice pngDevice = new PngDevice(resolution);

                using (FileStream imageStream = new FileStream(outputPngPath, FileMode.Create))
                {
                    pngDevice.Process(pdfDocument.Pages[pageNumber], imageStream);
                }
            }
        }
    }
}

@maseyo

Thanks for sharing the details. We are using them to test the case. In the meantime, please do share the sample console application when it is ready.

Here is the VB.NET console app. It is the exact same code, and here it runs, though it takes a minute. Also, the output file contains an error message that seems to be a licensing issue. No error is thrown. If you look in the pdf directory you will see 2 CSV files; both are identical and contain this error rather than the data you got in your output.

AsposeOCR.zip (639.5 KB)