Aspose.PDF TableAbsorber Reads a Single Table with Merged Cells as Multiple Tables

I am facing an issue when reading a table with TableAbsorber in Aspose.PDF: when the table has merged cells, it is returned as more than one table.
I have a single table, but TableAbsorber reports multiple tables. I have attached a PDF containing that single table.
Aspose merged table red as multiple table.pdf (281.3 KB)

@vijayanathan,

Since you already shared the PDF, can you share your code snippet please?

I tried this code using Aspose.PDF:

// Requires: using System; using Aspose.Pdf.Text;
public static void Extract_Table()
{
    // Load source PDF document
    Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(@"c:\tmp\the_worlds_cities_in_2018_data_booklet 7.pdf");
    foreach (var page in pdfDocument.Pages)
    {
        Aspose.Pdf.Text.TableAbsorber absorber = new Aspose.Pdf.Text.TableAbsorber();
        absorber.Visit(page);
        foreach (AbsorbedTable table in absorber.TableList)
        {
            foreach (AbsorbedRow row in table.RowList)
            {
                foreach (AbsorbedCell cell in row.CellList)
                {
                    // Text fragments found inside this cell
                    TextFragmentCollection textFragmentCollection = cell.TextFragments;
                    foreach (TextFragment fragment in textFragmentCollection)
                    {
                        string txt = "";
                        foreach (TextSegment seg in fragment.Segments)
                        {
                            txt += seg.Text;
                        }
                        Console.WriteLine(txt);
                    }
                }
            }
        }
    }
}

@vijayanathan,

I created this sample code:

private void Logic()
{
    Document doc = new Document($"{PartialPath}_input.pdf");

    foreach (var page in doc.Pages)
    {
        Aspose.Pdf.Text.TableAbsorber absorber = new Aspose.Pdf.Text.TableAbsorber();
        absorber.Visit(page);

        int countTable = 0;
        int countRow = 0;
        int countCell = 0;
        int countSegment = 0;
        foreach (AbsorbedTable table in absorber.TableList)
        {
            countTable++;
            countRow = 0;
            foreach (AbsorbedRow row in table.RowList)
            {
                countRow++;
                countCell = 0;
                foreach (AbsorbedCell cell in row.CellList)
                {
                    countCell++;
                    TextFragmentCollection textFragmentCollection = cell.TextFragments;
                    foreach (TextFragment fragment in textFragmentCollection)
                    {
                        
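                        // Overwrite the text with its table/row/cell index so the detected structure is visible in the output PDF
                        fragment.Text = $"T{countTable};R:{countRow};C:{countCell}";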
                        fragment.TextState.FontSize = 12;
                        fragment.TextState.ForegroundColor = Color.Black;

                        //fragment.Segments.Clear();
                        //countSegment = 0;
                        //foreach (TextSegment seg in fragment.Segments)
                        //{
                        //    seg.Text = $"S:{countSegment}";
                        //}
                    }
                }
            }
        }
    }

    doc.Save($"{PartialPath}_output.pdf");
}

The code is simple, but it is supposed to make the detected tables visible in the output, and it did not work properly. I will therefore create a bug report for the dev team.

@vijayanathan
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-53877

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.