Issue evaluating Aspose.OCR: ocr.Recognize hangs

I am evaluating Aspose.OCR. I need to take a flattened PDF file, process a page that has grids on it, and turn the image of each grid into actual grid data. I need to do this across many files and many structures.

The code I have hangs at the ocr.Recognize line. It runs for over 10 minutes with no error code and no timeout, and I have no idea what is wrong. I have coded this a few different ways, using a MemoryStream and using the PDF file. Please help so I can evaluate this product.

If there is a better way to get tables from a flattened pdf file I am open to all ideas you have for accuracy and processing speed.

I can already process non-flattened PDF files, so if there were a way to turn a flattened file into a data PDF file, that would work as well.

Dim results As List(Of Aspose.OCR.RecognitionResult) = ocr.Recognize(input, settings)
This line hangs no matter how I try to use it.

Here is the code I am running; it hangs on that line.

The attached PDF file is a one-page file with some grids on it. The file is flattened, and I need to read grids on flattened files. I can't evaluate your OCR API if I can't use it.

It keeps hanging on the results line. It just runs and runs with no errors, no timeout, and no explanation. Any help would be appreciated.

Public Function ConvertFlattenedPdfPageToCsvString(pdfPath As String, pageNumber As Integer, Optional dpi As Integer = 300) As String
    ' 1) Render the page to PNG bytes
    Dim pngBytes As Byte() = RenderPageToPng(pdfPath, pageNumber, dpi)

    ' 2) Build OCR input from the PNG stream
    Dim ocr As New Aspose.OCR.AsposeOcr()
    Dim settings As New Aspose.OCR.RecognitionSettings() With {
        .DetectAreasMode = Aspose.OCR.DetectAreasMode.TABLE
    }

    Using ms As New MemoryStream(pngBytes)
        Dim input As New Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage)
        ms.Position = 0
        input.Add(ms)

        ' 3) Run OCR (returns List(Of RecognitionResult)); take the first result
        Dim results As List(Of Aspose.OCR.RecognitionResult) = ocr.Recognize(input, settings)
        If results Is Nothing OrElse results.Count = 0 Then
            Throw New InvalidOperationException("OCR returned no results.")
        End If
        Dim result As Aspose.OCR.RecognitionResult = results(0)

        ' 4) Save OCR result to XLSX in-memory
        Using xlsxStream As New MemoryStream()
            result.Save(xlsxStream, Aspose.OCR.SaveFormat.Xlsx)
            xlsxStream.Position = 0

            ' 5) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
            Dim wb As New Workbook(xlsxStream)
            Dim csvOpts As New TxtSaveOptions(Aspose.Cells.SaveFormat.Csv) With {
                .Separator = ","c,
                .Encoding = System.Text.Encoding.UTF8,
                .AlwaysQuoted = True
            }

            Using csvStream As New MemoryStream()
                wb.Save(csvStream, csvOpts)
                Return System.Text.Encoding.UTF8.GetString(csvStream.ToArray())
            End Using
        End Using
    End Using
End Function

'--- Private: Render a single PDF page to PNG bytes at the specified DPI
Private Function RenderPageToPng(pdfPath As String,
                                 pageNumber As Integer,
                                 dpi As Integer) As Byte()
    Using doc As New Document(pdfPath)
        If pageNumber < 1 OrElse pageNumber > doc.Pages.Count Then
            Throw New ArgumentOutOfRangeException(NameOf(pageNumber),
                $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.")
        End If

        Dim res As New Resolution(dpi)
        Dim device As New PngDevice(res)

        Using outMs As New MemoryStream()
            device.Process(doc.Pages(pageNumber), outMs)
            Return outMs.ToArray()
        End Using
    End Using
End Function

Best Regards,

Mase Woods
MaseW@bidtracer.com
480.734.5077

Here is the PDF file I have been working with; it is one page with a number of grids.
Grace_Lutheran.pdf (637.1 KB)

@maseyo

It seems you’re experiencing an issue with the OCR process hanging when trying to recognize text from a flattened PDF page. Here are some steps and suggestions to troubleshoot and potentially resolve the issue:

  1. Check PDF Rendering: Ensure that the PDF page is being rendered correctly to a PNG image. You can add logging to verify that the RenderPageToPng function completes successfully and returns a valid byte array.

  2. Adjust DPI Settings: A high DPI setting can sometimes lead to performance issues. Try reducing the DPI value (e.g., to 150 or 200) in the ConvertFlattenedPdfPageToCsvString function to see if it improves processing time.

  3. Use a Different Input Type: Instead of using OcrInput with InputType.SingleImage, consider using InputType.MultiImage if you plan to process multiple images at once. This can sometimes help with performance.

  4. Increase Timeout: If the OCR process is taking too long, consider implementing a timeout mechanism to avoid hanging indefinitely. You can run the OCR in a separate task and use a cancellation token.

  5. Test with Different Images: To isolate the issue, test the OCR process with different PNG images (not just from the PDF) to see if the problem persists. This can help determine if the issue is with the specific PDF or the OCR setup.

  6. Check for Updates: Ensure you are using the latest version of the Aspose.OCR library, as updates may contain performance improvements and bug fixes.

  7. Alternative Approach: If the OCR continues to hang, consider using Aspose.PDF to convert the flattened PDF to a more structured format (like DOCX) before applying OCR. This can sometimes yield better results for table extraction.
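The timeout idea in step 4 could be sketched roughly as follows. This is a generic .NET pattern, not an Aspose.OCR feature: TimeoutSketch and TryRunWithTimeout are hypothetical names introduced here. As far as this thread shows, ocr.Recognize takes no cancellation token, so the sketch only stops waiting after the timeout; the recognition call itself keeps running in the background.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutSketch
{
    // Runs 'work' on a thread-pool thread and waits at most 'timeout' for it.
    // Returns true and sets 'result' if the work finished in time, otherwise false.
    // On timeout the work is NOT aborted; the caller simply stops waiting for it.
    // If 'work' throws, Wait rethrows the error wrapped in an AggregateException.
    public static bool TryRunWithTimeout<T>(Func<T> work, TimeSpan timeout, out T result)
    {
        Task<T> task = Task.Run(work);
        if (task.Wait(timeout))
        {
            result = task.Result;
            return true;
        }

        result = default(T);
        return false;
    }
}

// Hypothetical usage with the objects from the original post:
//   List<Aspose.OCR.RecognitionResult> results;
//   bool finished = TimeoutSketch.TryRunWithTimeout(
//       () => ocr.Recognize(input, settings), TimeSpan.FromMinutes(2), out results);
//   if (!finished) { /* log the file name and move on to the next page */ }
```

When processing many files, a wrapper like this at least lets the batch report which page stalled instead of blocking indefinitely.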

Here’s a minimal code snippet to adjust the DPI and add logging:

' Adjust DPI and add logging
Dim dpi As Integer = 150 ' Reduced DPI
Dim pngBytes As Byte() = RenderPageToPng(pdfPath, pageNumber, dpi)
If pngBytes Is Nothing OrElse pngBytes.Length = 0 Then
    Throw New InvalidOperationException("Failed to render PDF page to PNG.")
End If
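Beyond the empty-array check, it may help to confirm that the bytes really are a PNG and to log the pixel size that goes into OCR. The sketch below reads the width and height from the PNG header using plain byte operations, with no Aspose calls; PngSketch and TryReadPngSize are hypothetical names. For scale, a US Letter page (8.5 x 11 in) renders to 2550 x 3300 pixels at 300 DPI and 1275 x 1650 at 150 DPI.

```csharp
using System;

public static class PngSketch
{
    // The fixed 8-byte signature that starts every PNG file.
    private static readonly byte[] Signature = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    // Returns true if 'data' starts with a PNG signature followed by an IHDR chunk,
    // and reports the image width and height in pixels.
    public static bool TryReadPngSize(byte[] data, out int width, out int height)
    {
        width = 0;
        height = 0;
        if (data == null || data.Length < 24) return false;

        for (int i = 0; i < Signature.Length; i++)
        {
            if (data[i] != Signature[i]) return false;
        }

        // Bytes 8..11 hold the chunk length, bytes 12..15 the chunk type ("IHDR").
        if (data[12] != 'I' || data[13] != 'H' || data[14] != 'D' || data[15] != 'R') return false;

        // Width and height are 4-byte big-endian integers at offsets 16 and 20.
        width = ReadInt32BigEndian(data, 16);
        height = ReadInt32BigEndian(data, 20);
        return width > 0 && height > 0;
    }

    private static int ReadInt32BigEndian(byte[] d, int offset)
    {
        return (d[offset] << 24) | (d[offset + 1] << 16) | (d[offset + 2] << 8) | d[offset + 3];
    }
}
```

Logging these dimensions next to the elapsed time makes it easier to see whether a slow run correlates with an unexpectedly large image.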

If these suggestions do not resolve the issue, please provide any error messages or logs that may help further diagnose the problem.

Best of luck with your evaluation!

  1. Use a Different Input Type: Instead of using OcrInput with InputType.SingleImage, consider using InputType.MultiImage if you plan to process multiple images at once. This can sometimes help with performance.

There is no MultiImage option.

I am using Aspose.OCR 25.9.1.0.

Can you not give me code to turn this file into grids? What are the best practices? I just need to be able to scroll through each row and column of each grid.

What are the best practices? Where is the documentation for this?

@maseyo

We tested your code snippet in our environment and could not replicate the issue. The output is attached, and the code converted to C# is below for your reference:

private static void PerformOCROnPDFTable(string dataDir)
{
    // 1) Render the page to PNG bytes
    byte[] pngBytes = RenderPageToPng(dataDir + "Grace_Lutheran.pdf", 1, 300);

    // 2) Build OCR input from the PNG stream
    var ocr = new OCR.AsposeOcr();
    var settings = new OCR.RecognitionSettings
    {
        DetectAreasMode = OCR.DetectAreasMode.TABLE
    };

    using (var ms = new MemoryStream(pngBytes))
    {
        var input = new OCR.OcrInput(OCR.InputType.SingleImage);
        ms.Position = 0;
        input.Add(ms);

        // 3) Run OCR (returns List<RecognitionResult>); take the first result
        List<OCR.RecognitionResult> results = ocr.Recognize(input, settings);
        if (results == null || results.Count == 0)
            throw new InvalidOperationException("OCR returned no results.");

        OCR.RecognitionResult result = results[0];

        // 4) Save OCR result to XLSX in-memory
        using (var xlsxStream = new MemoryStream())
        {
            result.Save(xlsxStream, OCR.SaveFormat.Xlsx);
            xlsxStream.Position = 0;

            // 5) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
            var wb = new Aspose.Cells.Workbook(xlsxStream);
            var csvOpts = new Aspose.Cells.TxtSaveOptions(Aspose.Cells.SaveFormat.Csv)
            {
                Separator = ',',
                Encoding = Encoding.UTF8,
                AlwaysQuoted = true
            };

            // Save the CSV directly to disk; no intermediate stream is needed
            wb.Save(dataDir + "output.csv", csvOpts);
        }
    }
}

private static byte[] RenderPageToPng(string pdfPath, int pageNumber, int dpi)
{
    using (var doc = new Document(pdfPath))
    {
        if (pageNumber < 1 || pageNumber > doc.Pages.Count)
            throw new ArgumentOutOfRangeException(nameof(pageNumber),
                $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.");

        var res = new Resolution(dpi);
        var device = new PngDevice(res);

        using (var outMs = new MemoryStream())
        {
            device.Process(doc.Pages[pageNumber], outMs);
            return outMs.ToArray();
        }
    }
}

output.zip (4.4 KB)

Would you kindly share your complete environment details, such as OS name and version, .NET Framework version, and installed RAM size? If possible, please also share a sample console application in .zip format that we can use to replicate the issue you are facing.

Please also confirm what you mean above: are you looking to generate a CSV or Excel file, as you are already trying to do?
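If CSV is the target, one way to step through every row and column is to parse the string returned by ConvertFlattenedPdfPageToCsvString into a list of rows. The following is a minimal sketch and not part of any Aspose API: CsvSketch and ParseCsv are hypothetical names, and it assumes comma-separated fields with double-quote escaping, which is what the AlwaysQuoted option in the code above should produce.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvSketch
{
    // Splits CSV text into rows of cells. Handles quoted cells, doubled quotes ("")
    // inside quoted cells, and commas or line breaks inside quotes. Blank lines are skipped.
    public static List<List<string>> ParseCsv(string csv)
    {
        var rows = new List<List<string>>();
        var row = new List<string>();
        var cell = new StringBuilder();
        bool inQuotes = false;
        bool rowHasContent = false;

        // Drop a leading byte-order mark, in case the UTF-8 output starts with one.
        csv = csv.TrimStart('\uFEFF');

        for (int i = 0; i < csv.Length; i++)
        {
            char c = csv[i];
            if (inQuotes)
            {
                if (c != '"') { cell.Append(c); }
                else if (i + 1 < csv.Length && csv[i + 1] == '"') { cell.Append('"'); i++; }
                else { inQuotes = false; }
            }
            else if (c == '"') { inQuotes = true; rowHasContent = true; }
            else if (c == ',') { row.Add(cell.ToString()); cell.Clear(); rowHasContent = true; }
            else if (c == '\r' || c == '\n')
            {
                // Treat \r\n as a single line break.
                if (c == '\r' && i + 1 < csv.Length && csv[i + 1] == '\n') i++;
                if (rowHasContent) { row.Add(cell.ToString()); rows.Add(row); }
                row = new List<string>();
                cell.Clear();
                rowHasContent = false;
            }
            else { cell.Append(c); rowHasContent = true; }
        }

        // Flush the last row when the text does not end with a line break.
        if (rowHasContent) { row.Add(cell.ToString()); rows.Add(row); }
        return rows;
    }
}

// Usage: walk every cell by row and column index.
//   List<List<string>> rows = CsvSketch.ParseCsv(csvText);
//   for (int r = 0; r < rows.Count; r++)
//       for (int c = 0; c < rows[r].Count; c++)
//           Console.WriteLine("row " + r + ", col " + c + ": " + rows[r][c]);
```

This only makes the OCR output easier to traverse; it cannot correct cells that the recognition step placed in the wrong row or column.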

First off, THANK YOU! I really appreciate your help.

Unfortunately, the first thing I did was look at the output. The Excel file and the grids in the PDF file are not aligned at all. I would not be able to scroll through each grid with this output.

With your Aspose.PDF API I can read each table and scroll through each row.

Is what I am looking for possible? Can your API simply take this flattened page and get me tables like the PDF API does? Would it be possible to make the flattened file into a non-flattened PDF file?

Would the ocr.RecognitionResult object + result.CsvToTable() method do a better job?

Is there an AI that would do this?

I am asking for guidance as to how to get flattened table data in a usable way.

Dim ocr As New Aspose.OCR.AsposeOcr()
Dim result As Aspose.OCR.RecognitionResult = ocr.Recognize("table.png")

' Convert the CSV representation of the recognized table into structured form:
Dim table As List(Of List(Of String)) = result.CsvToTable()
Dim rows As List(Of String()) = result.CsvToRows()

@maseyo

Thanks for elaborating on your requirements. We need to investigate the feasibility of this on our end from the Aspose.OCR perspective, and for that we will register this case in our issue management system. Would you kindly share a sample of the expected output PDF for our reference so that it can be logged with the ticket as well? We will proceed accordingly.