TableAbsorber in Aspose.PDF reads a single table with merged cells as multiple tables

When reading a table with TableAbsorber in Aspose.PDF, I am facing an issue: when the table has merged cells, part of it is returned as another table.
I have one table, but it is reported as more than one table. I have attached the PDF containing the single table.
Aspose merged table red as multiple table.pdf (281.3 KB)

@vijayanathan,

Since you already shared the PDF, can you share your code snippet please?

I tried this code using Aspose.PDF:

using System;
using Aspose.Pdf.Text;

public static void Extract_Table()
{
    // Load source PDF document
    Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(@"c:\tmp\the_worlds_cities_in_2018_data_booklet 7.pdf");
    foreach (var page in pdfDocument.Pages)
    {
        Aspose.Pdf.Text.TableAbsorber absorber = new Aspose.Pdf.Text.TableAbsorber();
        absorber.Visit(page);
        foreach (AbsorbedTable table in absorber.TableList)
        {
            foreach (AbsorbedRow row in table.RowList)
            {
                foreach (AbsorbedCell cell in row.CellList)
                {
                    // Text fragments found inside this cell
                    TextFragmentCollection textFragmentCollection = cell.TextFragments;
                    foreach (TextFragment fragment in textFragmentCollection)
                    {
                        string txt = "";
                        foreach (TextSegment seg in fragment.Segments)
                        {
                            txt += seg.Text;
                        }
                        Console.WriteLine(txt);
                    }
                }
            }
        }
    }
}

@vijayanathan,

I created this sample code:

private void Logic()
{
    Document doc = new Document($"{PartialPath}_input.pdf");

    foreach (var page in doc.Pages)
    {
        Aspose.Pdf.Text.TableAbsorber absorber = new Aspose.Pdf.Text.TableAbsorber();
        absorber.Visit(page);

        int countTable = 0;
        int countRow = 0;
        int countCell = 0;
        int countSegment = 0;
        foreach (AbsorbedTable table in absorber.TableList)
        {
            countTable++;
            countRow = 0;
            foreach (AbsorbedRow row in table.RowList)
            {
                countRow++;
                countCell = 0;
                foreach (AbsorbedCell cell in row.CellList)
                {
                    countCell++; // advance the cell index so each cell gets its own label
                    TextFragmentCollection textFragmentCollection = cell.TextFragments;
                    foreach (TextFragment fragment in textFragmentCollection)
                    {
                        // Overwrite the cell text with its table/row/cell position
                        fragment.Text = $"T{countTable};R:{countRow};C:{countCell}";
                        fragment.TextState.FontSize = 12;
                        fragment.TextState.ForegroundColor = Color.Black;

                        //fragment.Segments.Clear();
                        //countSegment = 0;
                        //foreach (TextSegment seg in fragment.Segments)
                        //{
                        //    seg.Text = $"S:{countSegment}";
                        //}
                    }
                }
            }
        }
    }

    doc.Save($"{PartialPath}_output.pdf");
}

The code is simple: it is supposed to make the detected tables visible by labelling each cell, and it did not work properly on your document. So I will be creating a bug for the dev team.
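In the meantime, one possible stopgap is to regroup the pieces after absorption. The sketch below is only an illustration under stated assumptions, not an Aspose API and not a confirmed fix: `Box` and `TableFragmentGrouper` are hypothetical names, the coordinates are assumed to be copied from each absorbed table's bounding rectangle, and the tolerances are guesses that would need tuning per document. It treats boxes that share left and right edges and sit directly on top of each other as fragments of one table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical plain box type (not an Aspose class). In real code the values
// would be copied from each absorbed table's bounding rectangle (assumption).
public sealed class Box
{
    public double Llx { get; }
    public double Lly { get; }
    public double Urx { get; }
    public double Ury { get; }

    public Box(double llx, double lly, double urx, double ury)
    {
        Llx = llx; Lly = lly; Urx = urx; Ury = ury;
    }
}

public static class TableFragmentGrouper
{
    // Groups boxes that look like vertically stacked pieces of one table:
    // matching left/right edges (within edgeTolerance) and a vertical gap
    // to the previous piece of at most maxGap.
    public static List<List<Box>> Group(
        IEnumerable<Box> boxes, double maxGap = 2.0, double edgeTolerance = 2.0)
    {
        var groups = new List<List<Box>>();

        // PDF coordinates grow upward, so descending Ury walks the page top to bottom.
        foreach (var box in boxes.OrderByDescending(b => b.Ury))
        {
            if (groups.Count > 0)
            {
                var last = groups[groups.Count - 1];
                var prev = last[last.Count - 1];

                bool sameEdges = Math.Abs(prev.Llx - box.Llx) <= edgeTolerance
                              && Math.Abs(prev.Urx - box.Urx) <= edgeTolerance;
                bool stacked = prev.Lly - box.Ury <= maxGap;

                if (sameEdges && stacked)
                {
                    last.Add(box); // treat as a continuation of the same table
                    continue;
                }
            }

            groups.Add(new List<Box> { box }); // start a new table group
        }

        return groups;
    }
}
```

Each box is compared only with the last piece of the most recent group, so this handles one column of tables per page; side-by-side tables, or two genuinely separate tables that happen to touch, would need a stricter rule.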

@vijayanathan
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-53877

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.

Hi @carlos.molina,
Is there any update on ticket PDFNET-53877?

@thiru1711

Regretfully, there is no update yet about ticket resolution. We will surely inform you via this forum thread as soon as we have some news about its resolution. Please be patient and spare us some time.

We are sorry for the inconvenience.