I need to extract table data from PDF documents, and to do this I tried Aspose.PDF 20.9 for .NET with a temporary license.
Specifically, I tried the “TableAbsorber” class, and the first problem I’m having is that some content is not captured: the right column of the first table on page 1, and the contents of the first row of the second table (only the number 1 is captured).
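A minimal sketch of this kind of TableAbsorber extraction follows, to make it easier to see which cells come back empty. It follows the documented usage pattern and is not necessarily the exact code in the attached project; the input path and the class name `DumpTables` are placeholders.

```csharp
using System;
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class DumpTables
{
    static void Main(string[] args)
    {
        // Placeholder path: pass one of the affected PDFs as the first argument.
        string inputPath = args.Length > 0 ? args[0] : "input.pdf";

        using (Document pdf = new Document(inputPath))
        {
            foreach (Page page in pdf.Pages)
            {
                // Use a fresh absorber per page so results are not mixed.
                TableAbsorber absorber = new TableAbsorber();
                absorber.Visit(page);

                int tableIndex = 0;
                foreach (AbsorbedTable table in absorber.TableList)
                {
                    tableIndex++;
                    Console.WriteLine($"Page {page.Number}, table {tableIndex}");

                    foreach (AbsorbedRow row in table.RowList)
                    {
                        foreach (AbsorbedCell cell in row.CellList)
                        {
                            // Concatenate every text segment found in the cell.
                            var text = new StringBuilder();
                            foreach (TextFragment fragment in cell.TextFragments)
                                foreach (TextSegment segment in fragment.Segments)
                                    text.Append(segment.Text);

                            // Brackets make empty cells visible as [].
                            Console.Write($"[{text}] ");
                        }
                        Console.WriteLine();
                    }
                }
            }
        }
    }
}
```

Printing each cell in brackets shows whether a cell was detected but left empty, or whether the column was not detected at all.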
PDF_FirstTable_Page _1.JPG (9.3 KB)
Extracted_Data_FirstTable_Page_1.JPG (4.8 KB)
PDF_SecondTable_Page5.JPG (18.7 KB)
Extracted_Data_SecondTable_Page_5.JPG (11.1 KB)
Also, I would like to know whether there is a way to extract these tables as a DataTable; this would help a lot with parsing.
I have prepared a Windows Forms application containing the logic I am using, together with three PDF files that all give the same result (the PDF documents are in the “bin\InputFiles” folder).
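To illustrate the shape of result meant here, the sketch below builds a `System.Data.DataTable` from rows of cell text. It uses only `System.Data` and is not an Aspose API; the helper name `TableRows.ToDataTable` and the generated column names are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class TableRows
{
    // Builds a DataTable from rows of cell text. Column names are
    // generated ("Column1", "Column2", ...) because no header row is assumed.
    public static DataTable ToDataTable(IEnumerable<IList<string>> rows)
    {
        var table = new DataTable();
        foreach (IList<string> row in rows)
        {
            // Grow the column set so rows with more cells still fit.
            while (table.Columns.Count < row.Count)
                table.Columns.Add("Column" + (table.Columns.Count + 1), typeof(string));

            DataRow dataRow = table.NewRow();
            for (int i = 0; i < row.Count; i++)
                dataRow[i] = row[i];
            table.Rows.Add(dataRow);
        }
        return table;
    }
}
```

The text of each cell in an absorbed row could be collected into a list of strings and passed to this helper, one list per row. Shorter rows leave the extra columns as `DBNull`.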
AsposeExtractTables.zip (194.6 KB)
Could you kindly help me understand how to handle these problems?
Thank you in advance.