Evaluating Aspose.OCR issue

Can you not get me code to turn this file into grids? What are the best practices? I just need to be able to scroll through each row and column of each grid.

What are the best practices? Where is the documentation for this?

@maseyo

We have tested in our environment using your code snippet and could not replicate the issue. Below is the attached output and C# Converted code for your kind reference:

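// Assumes an alias such as "using OCR = Aspose.OCR;" plus "using Aspose.Pdf;" and "using Aspose.Pdf.Devices;"
private static void PerformOCROnPDFTable(string dataDir)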
{
    // 1) Render the page to PNG bytes
    byte[] pngBytes = RenderPageToPng(dataDir + "Grace_Lutheran.pdf", 1, 300);

    // 2) Build OCR input from the PNG stream
    var ocr = new OCR.AsposeOcr();
    var settings = new OCR.RecognitionSettings
    {
        DetectAreasMode = OCR.DetectAreasMode.TABLE
    };

    using (var ms = new MemoryStream(pngBytes))
    {
        var input = new OCR.OcrInput(OCR.InputType.SingleImage);
        ms.Position = 0;
        input.Add(ms);

        // 3) Run OCR (returns List<RecognitionResult>); take the first result
        List<OCR.RecognitionResult> results = ocr.Recognize(input, settings);
        if (results == null || results.Count == 0)
            throw new InvalidOperationException("OCR returned no results.");

        OCR.RecognitionResult result = results[0];

        // 4) Save OCR result to XLSX in-memory
        using (var xlsxStream = new MemoryStream())
        {
            result.Save(xlsxStream, OCR.SaveFormat.Xlsx);
            xlsxStream.Position = 0;

            // 5) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
            var wb = new Aspose.Cells.Workbook(xlsxStream);
            var csvOpts = new Aspose.Cells.TxtSaveOptions(Aspose.Cells.SaveFormat.Csv)
            {
                Separator = ',',
                Encoding = Encoding.UTF8,
                AlwaysQuoted = true
            };

            // Write the CSV straight to disk; no intermediate stream is needed
            wb.Save(dataDir + "output.csv", csvOpts);
        }
    }
}

private static byte[] RenderPageToPng(string pdfPath, int pageNumber, int dpi)
{
    using (var doc = new Document(pdfPath))
    {
        if (pageNumber < 1 || pageNumber > doc.Pages.Count)
            throw new ArgumentOutOfRangeException(nameof(pageNumber),
                $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.");

        var res = new Resolution(dpi);
        var device = new PngDevice(res);

        using (var outMs = new MemoryStream())
        {
            device.Process(doc.Pages[pageNumber], outMs);
            return outMs.ToArray();
        }
    }
}

output.zip (4.4 KB)

Would you kindly share your complete environment details, such as OS name and version, .NET Framework version, installed RAM size, etc.? If possible, please share a sample console application in .zip format that we can use to replicate the issue you are facing.

Please also confirm: when you say the above, do you mean generating a CSV or Excel file, just as you are already trying to achieve?

First off, THANK YOU! I really appreciate your help.

Unfortunately, the first thing I did was look at the output. The Excel file and the grids in the PDF file are not aligned at all. I would not be able to scroll through each grid with this output.

With your Aspose.PDF API I can read each table and scroll through each row.

Is what I am looking for possible? Can your API simply take this flattened page and get me tables like the PDF API does? Would it be possible to turn the flattened file into a non-flattened PDF file?

Would the ocr.RecognitionResult object + result.CsvToTable() method do a better job?

Is there an AI that would do this?

I am asking for guidance as to how to get flattened table data in a usable way.

Dim ocr As New Aspose.OCR.AsposeOcr()
Dim result As Aspose.OCR.RecognitionResult = ocr.Recognize("table.png")

' Convert the CSV representation of the recognized table into structured form:
Dim table As List(Of List(Of String)) = result.CsvToTable()
Dim rows As List(Of String()) = result.CsvToRows()

@maseyo

Thanks for further elaboration of your requirements. We need to investigate the feasibility at our end from Aspose.OCR perspective and for that, we will be registering this case in our issue management system. Would you kindly share a sample expected output PDF for our reference so that it can be logged with the ticket as well. We will further proceed accordingly.

Yes, I'd like an object in memory: a list of tables, with rows and cells, like in the PDF API. That would be great. We could turn that into XML, JSON, XLSX, or CSV; we could go anywhere from there.
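
To make the request concrete, here is a minimal C# sketch of what such an in-memory result could look like and how it would be walked. The shape (tables, then rows, then cell text) and the names `GridSketch` and `Walk` are hypothetical and are not part of Aspose.OCR; the sample values are placeholders.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape only: tables -> rows -> cell text. Not an Aspose.OCR type.
public static class GridSketch
{
    // Visits every cell and returns one "table/row/col: text" line per cell.
    public static List<string> Walk(List<List<List<string>>> tables)
    {
        var lines = new List<string>();
        for (int t = 0; t < tables.Count; t++)
            for (int r = 0; r < tables[t].Count; r++)
                for (int c = 0; c < tables[t][r].Count; c++)
                    lines.Add($"table {t}, row {r}, col {c}: {tables[t][r][c]}");
        return lines;
    }

    public static void Main()
    {
        // Placeholder data standing in for one recognized table with two rows.
        var tables = new List<List<List<string>>>
        {
            new List<List<string>>
            {
                new List<string> { "Name", "Amount" },
                new List<string> { "Sample", "10.00" }
            }
        };

        foreach (string line in Walk(tables))
            Console.WriteLine(line);
    }
}
```

From a structure like this, serializing to XML, JSON, or CSV is a matter of replacing the line formatting inside the innermost loop.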

@maseyo

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): OCRNET-1122

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

I don't need paid support. Your product is supposed to be able to read flattened PDF files and get the tables out of them. I need that. Just regular support. I need to know I can do this before purchasing the product. Thanks :)

I have OCR version 25.9.1.0.

OCR.DetectAreasMode does not exist: .DetectAreasMode = OCR.DetectAreasMode.TABLE
OCR.InputType does not exist: Dim input As New OCR.OcrInput(OCR.InputType.SingleImage)

OCR.SaveFormat does not exist: result.Save(xlsxStream, OCR.SaveFormat.Xlsx), and the same goes for:
Dim csvOpts As New TxtSaveOptions(SaveFormat.CSV) With {

Can you please let me know the workarounds, and see if you can get this to work in C# so I can get it to work in VB.NET? Thanks.

Private Shared Sub PerformOCROnPDFTable(dataDir As String)
    ' 1) Render the page to PNG bytes
    Dim pngBytes As Byte() = RenderPageToPng(Path.Combine(dataDir, "Grace_Lutheran.pdf"), 1, 300)

    ' 2) Build OCR input from the PNG stream
    Dim ocr = New OCR.AsposeOcr()
    Dim settings As New OCR.RecognitionSettings() With {
        .DetectAreasMode = OCR.DetectAreasMode.TABLE
    }

    Using ms As New MemoryStream(pngBytes)
        Dim input As New OCR.OcrInput(OCR.InputType.SingleImage)
        ms.Position = 0
        input.Add(ms)

        ' 3) Run OCR (returns List(Of RecognitionResult)); take the first result
        Dim results As System.Collections.Generic.List(Of OCR.RecognitionResult) = ocr.Recognize(input, settings)
        If results Is Nothing OrElse results.Count = 0 Then
            Throw New InvalidOperationException("OCR returned no results.")
        End If

        Dim result As OCR.RecognitionResult = results(0)

        ' 4) Save OCR result to XLSX in-memory
        Using xlsxStream As New MemoryStream()
            result.Save(xlsxStream, OCR.SaveFormat.Xlsx)
            xlsxStream.Position = 0

            ' 5) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
            Dim wb As New Workbook(xlsxStream)
            Dim csvOpts As New TxtSaveOptions(SaveFormat.CSV) With {
                .Separator = ","c,
                .Encoding = Encoding.UTF8,
                .AlwaysQuoted = True
            }

            ' Save CSV to disk (e.g., App_Data\output.csv)
            Dim outCsvPath As String = Path.Combine(dataDir, "output.csv")
            wb.Save(outCsvPath, csvOpts)
        End Using
    End Using
End Sub

Private Shared Function RenderPageToPng(pdfPath As String, pageNumber As Integer, dpi As Integer) As Byte()
    Using doc As New Document(pdfPath)
        If pageNumber < 1 OrElse pageNumber > doc.Pages.Count Then
            Throw New ArgumentOutOfRangeException(NameOf(pageNumber),
                $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.")
        End If

        Dim res As New Resolution(dpi)
        Dim device As New PngDevice(res)

        Using outMs As New MemoryStream()
            device.Process(doc.Pages(pageNumber), outMs)
            Return outMs.ToArray()
        End Using
    End Using
End Function

Ok I got it to compile, but it still hangs on the Recognize line

 Dim results As System.Collections.Generic.List(Of OCR.RecognitionResult) = ocr.Recognize(input, settings)

Here is the .NET code. This is a VB project. It compiles but hangs.

Same file, same everything.

Public Shared Sub PerformOCROnPDFTable(ByRef dataDir As String, ByRef FIleName As String)
 ' 1) Render the page to PNG bytes
 Dim pngBytes As Byte() = RenderPageToPng(Path.Combine(dataDir, FIleName), 1, 300)

 ' 2) Build OCR input from the PNG stream
 Dim ocr = New OCR.AsposeOcr()
 Dim settings As New OCR.RecognitionSettings() With {
     .DetectAreasMode = Aspose.OCR.DetectAreasMode.TABLE
 }

 Using ms As New MemoryStream(pngBytes)
     Dim input As New OCR.OcrInput(Aspose.OCR.InputType.SingleImage)
     ms.Position = 0
     input.Add(ms)

     ' 3) Run OCR (returns List(Of RecognitionResult)); take the first result
     Dim results As System.Collections.Generic.List(Of OCR.RecognitionResult) = ocr.Recognize(input, settings)
     If results Is Nothing OrElse results.Count = 0 Then
         Throw New InvalidOperationException("OCR returned no results.")
     End If

     Dim result As OCR.RecognitionResult = results(0)

     ' 4) Save OCR result to XLSX in-memory
     Using xlsxStream As New MemoryStream()
         result.Save(xlsxStream, Aspose.OCR.SaveFormat.Xlsx)
         xlsxStream.Position = 0

         ' 5) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
         Dim wb As New Workbook(xlsxStream)
         Dim csvOpts As New Aspose.Cells.TxtSaveOptions(Aspose.Cells.SaveFormat.Csv) With {
             .Separator = ","c,
             .Encoding = Encoding.UTF8,
             .AlwaysQuoted = True
         }

         ' Save CSV to disk (e.g., App_Data\output.csv)
         Dim outCsvPath As String = Path.Combine(dataDir, "output.csv")
         wb.Save(outCsvPath, csvOpts)
     End Using
 End Using

End Sub

@maseyo

We apologize for the confusion. We are not asking you to purchase paid support :) The generic message was posted automatically when a ticket was attached to this forum thread. We have already reviewed your requirements and noticed that the API is unable to produce the expected results. That is why we logged a ticket in our internal issue tracking system to analyze the scenario in detail.

We will look into the details of the ticket on a first-come, first-served basis, and as soon as we have updates regarding its resolution we will inform you via this forum thread. Please be patient and spare us some time.

We are sorry for the inconvenience.

PS: We were not able to replicate the issue where program hangs. Can you please share a sample console application in .zip format with us that we can use to reproduce the issue?

I will make you a sample console app. I just wanted to let you know that I added a C# project to my solution and call it from the VB project. It still hangs in the exact same place. The line that hangs is var results = ocr.Recognize(input, settings);

I tried using both a MemoryStream and a PNG file that actually exists. This is quite puzzling. I am using .NET Framework 4.7.2 (required) on Windows 11 Home, version 10.0.26200.
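
One way to tell a true hang from a slow first run is to time the call and give up after a generous limit. The helper below is a minimal sketch in plain .NET (it compiles against .NET Framework 4.7.2); `HangProbe` and `TryRun` are hypothetical names, not Aspose APIs. In this thread the wrapped call would be `() => ocr.Recognize(input, settings)`.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class HangProbe
{
    // Runs 'work' on a thread-pool thread and waits up to 'timeout'.
    // Returns true and sets 'result' if the work finished in time.
    // On timeout it returns false; the work is NOT cancelled and keeps running.
    // If the work throws, Task.Wait rethrows it wrapped in an AggregateException.
    public static bool TryRun<T>(Func<T> work, TimeSpan timeout, out T result, out TimeSpan elapsed)
    {
        Stopwatch watch = Stopwatch.StartNew();
        Task<T> task = Task.Run(work);
        bool finished = task.Wait(timeout);
        elapsed = watch.Elapsed;
        result = finished ? task.Result : default(T);
        return finished;
    }

    public static void Main()
    {
        int value;
        TimeSpan elapsed;

        // Stand-in workload; replace the lambda with the call being investigated.
        bool ok = TryRun(() => { Thread.Sleep(200); return 42; },
                         TimeSpan.FromSeconds(5), out value, out elapsed);

        Console.WriteLine(ok
            ? "Finished in " + elapsed.TotalMilliseconds + " ms"
            : "Still running after " + elapsed.TotalSeconds + " s");
    }
}
```

If the call returns after a minute or so, the problem is slowness rather than a deadlock, which narrows down what to report.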

Ok I will make a console app in vb now.

using Aspose.Pdf;
using Aspose.Pdf.Devices;
using Aspose.Pdf.Operators;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices.ComTypes;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;
using Cells = Aspose.Cells;
using Ocr = Aspose.OCR;

//using Aspose.OCR;
//using Aspose.Cells;

namespace AsposeCSharp
{
public class OCRAspose
{

  public static void PerformOCROnPDFTable(string dataDir)
    {
        string pdfPath = Path.Combine(dataDir, "Grace_Lutheran.pdf");
        string outPath = Path.Combine(dataDir, "Grace_Lutheran.png");

        int pageNumber = 1;
        int dpi = 300;
        var ocr = new Ocr.AsposeOcr();

        //byte[] pngBytes = RenderPageToPng(pdfPath, pageNumber, dpi);
        RenderPageToPng(pdfPath, outPath, pageNumber, dpi);

        var settings = new Ocr.RecognitionSettings
        {
            DetectAreasMode = Ocr.DetectAreasMode.TABLE
        };

        var input = new Ocr.OcrInput(Ocr.InputType.SingleImage);
        //using (var ms = new MemoryStream(pngBytes))
        //{
        //    ms.Position = 0;
        //    input.Add(ms);
        //}
        input.Add(outPath);

        var results = ocr.Recognize(input, settings);
        var result = results[0];

        // --- Disambiguate SaveFormats explicitly ---
        using (var xlsxStream = new MemoryStream())
        {
            result.Save(xlsxStream, Ocr.SaveFormat.Xlsx);  // ← fully qualified
            xlsxStream.Position = 0;

            var wb = new Cells.Workbook(xlsxStream);
            var csvOpts = new Cells.TxtSaveOptions(Cells.SaveFormat.Csv) // ← fully qualified
            {
                Separator = ',',
                Encoding = Encoding.UTF8,
                QuoteType = Cells.TxtValueQuoteType.Always
            };

            wb.Save(Path.Combine(dataDir, "output.csv"), csvOpts);
        }
    }

    private static byte[] RenderPageToPng(string pdfPath, int pageNumber, int dpi)
    {
        using (var doc = new Document(pdfPath))
        {
            if (pageNumber < 1 || pageNumber > doc.Pages.Count)
                throw new ArgumentOutOfRangeException(nameof(pageNumber),
                    $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.");

            var res = new Resolution(dpi);
            var device = new PngDevice(res);

            using (var outMs = new MemoryStream())
            {
                device.Process(doc.Pages[pageNumber], outMs);
                return outMs.ToArray();
            }
        }
    }

    static void RenderPageToPng(string pdfPath, string outputPngPath, int pageNumber, int dpi)
    {
        using (Document pdfDocument = new Document(pdfPath))
        {
            if (pageNumber < 1 || pageNumber > pdfDocument.Pages.Count)
                throw new ArgumentOutOfRangeException(nameof(pageNumber),
                    $"Page {pageNumber} is out of range. Document has {pdfDocument.Pages.Count} pages.");

            Resolution resolution = new Resolution(dpi);
            PngDevice pngDevice = new PngDevice(resolution);

            using (FileStream imageStream = new FileStream(outputPngPath, FileMode.Create))
            {
                pngDevice.Process(pdfDocument.Pages[pageNumber], imageStream);
            }
        }
    }




}

}

@maseyo

Have you tried executing the application in Release mode instead of Debug? We believe it will require detailed debugging if OCR related code is in another project and being called through reference. It will be quite helpful for us if we are provided with a sample Console application (minimal) to reproduce the same issue in our environment. This way we will be able to investigate it accordingly and address it for you. We sincerely apologize for the inconvenience you have been facing.

PS: Have you been using a 30-day free temporary license for API evaluation?

Here is the C# code in the separate project I added to my VB solution. Please feel free to run it however you need to.

I am working on the console app in VB.NET now. I'll send it along as soon as possible.

using Aspose.Pdf;
using Aspose.Pdf.Devices;
using Aspose.Pdf.Operators;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices.ComTypes;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;
using Cells = Aspose.Cells;
using Ocr = Aspose.OCR;

//using Aspose.OCR;
//using Aspose.Cells;

namespace AsposeCSharp
{
    public class OCRAspose
    {

      public static void PerformOCROnPDFTable(string dataDir)
        {
            string pdfPath = Path.Combine(dataDir, "Grace_Lutheran.pdf");
            string outPath = Path.Combine(dataDir, "Grace_Lutheran.png");

            int pageNumber = 1;
            int dpi = 300;
            var ocr = new Ocr.AsposeOcr();

            //byte[] pngBytes = RenderPageToPng(pdfPath, pageNumber, dpi);
            RenderPageToPng(pdfPath, outPath, pageNumber, dpi);

            var settings = new Ocr.RecognitionSettings
            {
                DetectAreasMode = Ocr.DetectAreasMode.TABLE
            };

            var input = new Ocr.OcrInput(Ocr.InputType.SingleImage);
            //using (var ms = new MemoryStream(pngBytes))
            //{
            //    ms.Position = 0;
            //    input.Add(ms);
            //}
            input.Add(outPath);

            var results = ocr.Recognize(input, settings);
            var result = results[0];

            // --- Disambiguate SaveFormats explicitly ---
            using (var xlsxStream = new MemoryStream())
            {
                result.Save(xlsxStream, Ocr.SaveFormat.Xlsx);  // ← fully qualified
                xlsxStream.Position = 0;

                var wb = new Cells.Workbook(xlsxStream);
                var csvOpts = new Cells.TxtSaveOptions(Cells.SaveFormat.Csv) // ← fully qualified
                {
                    Separator = ',',
                    Encoding = Encoding.UTF8,
                    QuoteType = Cells.TxtValueQuoteType.Always
                };

                wb.Save(Path.Combine(dataDir, "output.csv"), csvOpts);
            }
        }
        
       

        private static byte[] RenderPageToPng(string pdfPath, int pageNumber, int dpi)
        {
            using (var doc = new Document(pdfPath))
            {
                if (pageNumber < 1 || pageNumber > doc.Pages.Count)
                    throw new ArgumentOutOfRangeException(nameof(pageNumber),
                        $"Page {pageNumber} is out of range. Document has {doc.Pages.Count} pages.");

                var res = new Resolution(dpi);
                var device = new PngDevice(res);

                using (var outMs = new MemoryStream())
                {
                    device.Process(doc.Pages[pageNumber], outMs);
                    return outMs.ToArray();
                }
            }
        }

        static void RenderPageToPng(string pdfPath, string outputPngPath, int pageNumber, int dpi)
        {
            using (Document pdfDocument = new Document(pdfPath))
            {
                if (pageNumber < 1 || pageNumber > pdfDocument.Pages.Count)
                    throw new ArgumentOutOfRangeException(nameof(pageNumber),
                        $"Page {pageNumber} is out of range. Document has {pdfDocument.Pages.Count} pages.");

                Resolution resolution = new Resolution(dpi);
                PngDevice pngDevice = new PngDevice(resolution);

                using (FileStream imageStream = new FileStream(outputPngPath, FileMode.Create))
                {
                    pngDevice.Process(pdfDocument.Pages[pageNumber], imageStream);
                }
            }
        }
    }
}

@maseyo

Thanks for sharing the details. We are testing the case with the details provided. In the meantime, please share the sample console application when it is ready.

Here is the VB.NET console app. It is the exact same code, and this one runs, although it takes a minute to run. Also, the output file contains an error message that seems to be a licensing issue. No exception is thrown. If you look in the pdf directory you will see two CSV files; both are identical and contain this error, not the data you got in your output.

AsposeOCR.zip (639.5 KB)

Microsoft Visual Studio Community 2022 (64-bit) - Current
Version 17.14.17

.NET Framework 4.7.2

The OCR license has expired. May I please have an extension, like the one I got for the PDF license? As with the PDF API, if OCR works for us we will purchase a license. Thank you.

@maseyo

Yes, this seems like a licensing issue.

Please post your request in our Purchase forum to get an extension of the trial license.

Furthermore, about the earlier logged ticket - please check below code sample:

// 1) Build OCR input from the PDF path
var ocr = new AsposeOcr();
var settings = new RecognitionSettings
{
    DetectAreasMode = DetectAreasMode.TABLE
};

var input = new OcrInput(InputType.PDF);
input.Add(dataDir + "Grace_Lutheran.pdf");

// 2) Run OCR (returns List<RecognitionResult>); take the first result
List<RecognitionResult> results = ocr.Recognize(input, settings);
if (results == null || results.Count == 0)
    throw new InvalidOperationException("OCR returned no results.");

RecognitionResult result = results[0];

// 3) Save OCR result to XLSX in-memory
using (var xlsxStream = new MemoryStream())
{
    result.Save(xlsxStream, SaveFormat.Xlsx);
    xlsxStream.Position = 0;

    // 4) Use Aspose.Cells to convert XLSX -> CSV (UTF-8, quoted)
    var wb = new Aspose.Cells.Workbook(xlsxStream);
    var csvOpts = new Aspose.Cells.TxtSaveOptions(Aspose.Cells.SaveFormat.Csv)
    {
        Separator = ',',
        Encoding = Encoding.UTF8,
        AlwaysQuoted = true
    };

    // Write the CSV straight to disk; no intermediate stream is needed
    wb.Save(dataDir + "output1.csv", csvOpts);
}

Aspose.OCR can recognize PDF files directly, without conversion to PNG. As for CSV, we will add this format to OCR.SaveFormat; it will be available in release 25.11.0.
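
Until a structured result is available, a CSV such as output.csv can be read back into rows and cells with a few lines of plain C#. This is a minimal sketch, assuming comma-separated text with optional double-quoted fields (a doubled quote inside a quoted field stands for one quote character); `CsvSketch` and `Parse` are hypothetical names and the sample text is a placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvSketch
{
    // Splits CSV text into rows of cells. Handles quoted fields, escaped quotes (""),
    // and commas or line breaks inside quotes. Completely empty lines are skipped.
    public static List<List<string>> Parse(string text)
    {
        var rows = new List<List<string>>();
        var row = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;
        bool rowHasContent = false;

        for (int i = 0; i < text.Length; i++)
        {
            char ch = text[i];
            if (inQuotes)
            {
                if (ch != '"') { field.Append(ch); }
                else if (i + 1 < text.Length && text[i + 1] == '"') { field.Append('"'); i++; }
                else { inQuotes = false; }
            }
            else if (ch == '"') { inQuotes = true; rowHasContent = true; }
            else if (ch == ',') { row.Add(field.ToString()); field.Clear(); rowHasContent = true; }
            else if (ch == '\r') { /* ignored; '\n' terminates the row */ }
            else if (ch == '\n')
            {
                if (rowHasContent || field.Length > 0) { row.Add(field.ToString()); rows.Add(row); }
                row = new List<string>();
                field.Clear();
                rowHasContent = false;
            }
            else { field.Append(ch); }
        }

        // Flush the last row when the text does not end with a line break.
        if (rowHasContent || field.Length > 0) { row.Add(field.ToString()); rows.Add(row); }
        return rows;
    }

    public static void Main()
    {
        // Placeholder text; in practice this could come from File.ReadAllText(path, Encoding.UTF8).
        string csv = "\"Name\",\"Amount\"\r\n\"Sample, Inc.\",\"10.00\"\r\n";
        List<List<string>> rows = Parse(csv);

        for (int r = 0; r < rows.Count; r++)
            for (int c = 0; c < rows[r].Count; c++)
                Console.WriteLine("row " + r + ", col " + c + ": " + rows[r][c]);
    }
}
```

This only recovers the grid that the CSV already encodes; if the recognized columns are misaligned in the export, they will be misaligned here too.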

Also, we will add a task for the RecognitionResult class to return a structure with cells (columns and rows containing coordinates and text). We only need your confirmation: will this kind of result help you achieve what you actually require?

THANK YOU AGAIN!

Being able to not make a png first is awesome!!

Second, if you are familiar with how the Aspose.PDF API works, you create an absorber object which absorbs a page, like Recognize does. Since I am looking for tables, the absorber returns a TableList of all the tables it found on a page; each table has a RowList, and each row has a CellList.

If you could keep that paradigm, one you already have, and return tables as if they were absorbed, that would be great for everyone, as they already know what that is.

Heck, I’d even name the function “absorb”

Here is my function to return an absorbed table

Function getAbsorber(ByVal iPage As Aspose.Pdf.Page) As Aspose.Pdf.Text.TableAbsorber
    Dim iReturn As Aspose.Pdf.Text.TableAbsorber

    Try
        iReturn = New Aspose.Pdf.Text.TableAbsorber
        iReturn.Visit(iPage)
    Catch ex As Exception
        Console.WriteLine("getAbsorber Error: " & ex.Message)
    End Try

    Return iReturn
End Function

The tables are in the Aspose.Pdf.Text.TableAbsorber object. Tables have to be a widely needed feature, no?
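
For readers who have not used it, the paradigm described above looks roughly like this in C#. It is a sketch that assumes the Aspose.PDF for .NET package is referenced and that the PDF contains real text (it does not help with flattened, image-only pages, which is the problem in this thread). `AbsorberWalk` and `PrintTables` are hypothetical names.

```csharp
using System;
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Text;

public static class AbsorberWalk
{
    // Prints every table on every page, one line per row, cells separated by " | ".
    public static void PrintTables(string pdfPath)
    {
        using (var doc = new Document(pdfPath))
        {
            foreach (Page page in doc.Pages)
            {
                var absorber = new TableAbsorber();
                absorber.Visit(page);

                foreach (AbsorbedTable table in absorber.TableList)
                {
                    foreach (AbsorbedRow row in table.RowList)
                    {
                        var line = new StringBuilder();
                        foreach (AbsorbedCell cell in row.CellList)
                        {
                            // A cell can hold several text fragments; join them.
                            foreach (TextFragment fragment in cell.TextFragments)
                                line.Append(fragment.Text);
                            line.Append(" | ");
                        }
                        Console.WriteLine(line.ToString());
                    }
                }
            }
        }
    }
}
```

An OCR result exposing the same tables, rows, and cells hierarchy would let code like this work unchanged on scanned pages, which is the gist of the request.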

When is 25.11.0 available?