Parse table in existing PDF

Hello,

We are trying to parse a table from a PDF file with the TableAbsorber class, but some of the contents (columns) are missing from the absorber's output. The PDF file and the parsing results are attached.

We used the code snippet below:

Document pdfDocument = new Document(dataDir + "Max.pdf");
TableAbsorber absorber = new TableAbsorber();

// Extract tables from the first page (page indexes are 1-based)
absorber.Visit(pdfDocument.Pages[1]);

foreach (AbsorbedTable table in absorber.TableList)
{
    foreach (AbsorbedRow row in table.RowList)
    {
        foreach (AbsorbedCell cell in row.CellList)
        {
            // A cell can hold several text fragments; print them space-separated
            foreach (TextFragment text in cell.TextFragments)
            {
                Console.Write(text.Text + " ");
            }
            Console.Write("|");
        }
        Console.WriteLine("-------------------------------------------");
    }
    Console.WriteLine("===========================================");
}

Test.pdf (38.5 KB)
Result.pdf (41.0 KB)

@knjeckil,

Could you please share a sample of the desired result so that we can investigate further and help you out?

Please find the expected result attached.
ExpectedResult.pdf (59.4 KB)

@knjeckil,

Thanks for sharing the requested file.

I have observed your issue and logged an investigation ticket with ID PDFNET-47610 in our issue tracking system so that it can be investigated and resolved as soon as possible.