Table is read with wrong rows and columns

Hi, when I read the attached PDF, the table's rows and columns come out wrong.
filetest.pdf (212.7 KB)

This is my code:

    public string ReadPdfWithApose(string Filename)
    {
        string kq = "";
        Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(Filename);
        TableAbsorber absorber = new TableAbsorber();
        absorber.Visit(pdfDocument.Pages[1]);
        foreach (AbsorbedTable table in absorber.TableList)
        {
            foreach (AbsorbedRow row in table.RowList)
            {
                foreach (AbsorbedCell cell in row.CellList)
                {
                    foreach (TextFragment text in cell.TextFragments)
                    {
                        kq += text.Text + " ";
                    }
                    kq += "|";
                }
                kq += "<br />-------------------------------------------";
            }
            kq += "<br />===========================================";
        }
        return kq;
    }

image.png (8.3 KB)

Please help me read the table correctly. Thanks!

@luanle

Thank you for contacting support.

We get the output shown in this screenshot. Would you please elaborate on the problem by comparing it with the source document? Also, please make sure you are using Aspose.PDF for .NET 19.6 in your environment.

Thank you for your reply.
I’m using Aspose.PDF for .NET 19.6: image.png (6.8 KB)

In my code above,
kq += "|"; marks the end of a cell, and
kq += "<br />-------------------------------------------"; marks the end of a row.

Actual result: image.png (8.4 KB)
Expected result: image.png (5.1 KB)

My code runs in ASP.NET. Please suggest how to use this DLL to convert the table data above into a DataTable.

@luanle

Thank you for the details.

The contents of the screenshot we shared with you earlier and of your expected output appear to be the same. Kindly elaborate on the differences with screenshots so that we can investigate further.

Hi,
In my expected output there is a vertical line between cells. Alternatively, I would like to read the table and return a DataTable.
image.png (21.6 KB)

@luanle

Thank you for elaborating it further.

Would you please try the code snippet below and share your feedback with us? We have also attached image.png for your reference.

    public string ReadPdfWithApose(string Filename)
    {
        string kq = "";
        Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(Filename);
        TableAbsorber absorber = new TableAbsorber();
        absorber.Visit(pdfDocument.Pages[1]);
        foreach (AbsorbedTable table in absorber.TableList)
        {
            foreach (AbsorbedRow row in table.RowList)
            {
                foreach (AbsorbedCell cell in row.CellList)
                {
                    foreach (TextFragment text in cell.TextFragments)
                    {
                        //kq += text.Text + " ";
                        kq += text.Text + " | ";
                    }
                }
                //kq += "<br />-------------------------------------------";
                kq += "\n";
            }
            //kq += "<br />===========================================";
            kq += "\n";
        }
        return kq;
    }

Thank you for your reply.
With this output it is difficult to match the text to a cell, as your image shows: image.png (23.9 KB)

| jokermann723@aol.c | om | => should be | jokermann723@aol.com |
or | jeffrey Zitzelberger | 3892 W 1450 N, | West Point, | UT-84015-7314 | => should be | jeffrey Zitzelberger 3892 W 1450 N, West Point, UT-84015-7314 |

Do you have any idea how to read it into a DataTable? @Farhan.Raza

@luanle

Thank you for clarifying the differences.

We have logged a ticket with ID PDFNET-46577 in our issue management system for further investigation. Moreover, it may not be possible to build a DataTable efficiently, because some text is extracted as more than one TextFragment within a single cell. We will let you know as soon as a significant update on this ticket is available.
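
In the meantime, one possible workaround is to merge all TextFragment objects of a cell into a single string before adding the row to a DataTable. The following is only a minimal sketch under stated assumptions, not a confirmed fix: the class PdfTableReader, the method ReadFirstTable, the generated column names and the fragmentSeparator parameter are hypothetical names chosen for illustration, and only the first table on page 1 is read. It uses the same TableAbsorber calls as the snippets above.

```csharp
using System.Collections.Generic;
using System.Data;
using Aspose.Pdf;
using Aspose.Pdf.Text;

// Hypothetical helper for illustration; it is not part of Aspose.PDF.
public static class PdfTableReader
{
    public static DataTable ReadFirstTable(string fileName, string fragmentSeparator = " ")
    {
        var result = new DataTable();

        using (var pdfDocument = new Document(fileName))
        {
            var absorber = new TableAbsorber();
            absorber.Visit(pdfDocument.Pages[1]); // same page as in the snippets above

            if (absorber.TableList.Count == 0)
            {
                return result; // no table detected on the page
            }

            AbsorbedTable table = absorber.TableList[0];
            foreach (AbsorbedRow row in table.RowList)
            {
                var values = new List<object>();
                foreach (AbsorbedCell cell in row.CellList)
                {
                    // Merge every fragment of the cell into one value,
                    // so one cell always maps to one DataTable column.
                    var parts = new List<string>();
                    foreach (TextFragment fragment in cell.TextFragments)
                    {
                        parts.Add(fragment.Text);
                    }
                    values.Add(string.Join(fragmentSeparator, parts));
                }

                // Assumption: columns get generated names and are added
                // on demand, because rows may differ in cell count.
                while (result.Columns.Count < values.Count)
                {
                    result.Columns.Add("Column" + (result.Columns.Count + 1));
                }
                result.Rows.Add(values.ToArray());
            }
        }

        return result;
    }
}
```

Note that the separator is a trade-off. With the default " ", the address fragments join as wanted, but a value that was split in the middle of a word, such as the e-mail address above, would come out as "jokermann723@aol.c om". Passing an empty separator fixes that case and glues the address lines together instead.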

Hi,
Thank you so much, I really appreciate your help @Farhan.Raza

The issues you have found earlier (filed as PDFNET-46577) have been fixed in Aspose.PDF for .NET 23.8.