Get tables from each page of a PDF and combine them into a single table in one sheet

Hi,

I’m using Aspose.PDF to read a PDF file and Aspose.Cells to create a worksheet from each page. I have attached the sample project I’m trying to get working.

UpwPdfToExcel.zip (9.5 KB)

I’m trying to convert the tables in the attached PDF into the desired Excel file. I’m stuck on getting only selected tables into the sheet to produce the Excel file.

PriceBook.pdf (711.3 KB)

The target Excel file is available at https://docs.google.com/spreadsheets/d/1a9Uwx2W0wVl6YhmJfheJ6YnpjSQd5Aus/edit?usp=sharing&ouid=114931800037126868341&rtpof=true&sd=true.

Can you please help me?

@vssaini
Do you mean that you want to gather all the tables in the PDF into one sheet?
If they are in one sheet, do you need to separate those tables with some empty rows?

Yes, kind of. If you look at the records in the Excel file, you will notice that they are pulled from each table and combined.

@vssaini
Please check the attached code: AsposeService_test.zip (2.3 KB)
If you want to gather everything into one worksheet, simply import all the data into the first worksheet.
You also have to check the data in the first row of each table to determine whether it is a table you need, as on line 82 of the .cs file.
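
For readers without the attachment, the idea of checking each table's first row and appending only the matching tables to one sheet can be sketched roughly as follows. This is a library-agnostic Python illustration, not the Aspose API and not the attached code: it assumes the tables have already been extracted into plain lists of rows, `merge_tables` is a hypothetical helper name, and the sample rows are made up. The optional `gap_rows` parameter covers the case of separating tables with empty rows.

```python
from typing import Iterable, List

Row = List[str]
Table = List[Row]


def merge_tables(tables: Iterable[Table],
                 wanted_first_header: str,
                 gap_rows: int = 0) -> Table:
    """Combine the data rows of every table whose first header cell matches.

    Hypothetical helper: tables are plain lists of rows, already extracted
    from the PDF by whatever library is in use.
    """
    wanted = wanted_first_header.strip().lower()
    merged: Table = []
    header_written = False

    for table in tables:
        # Skip empty tables and tables with an empty first row.
        if not table or not table[0]:
            continue

        header = table[0]
        # Check the first row to decide whether this table is needed.
        if header[0].strip().lower() != wanted:
            continue

        if not header_written:
            # Write the header once, from the first matching table.
            merged.append(list(header))
            header_written = True
        elif gap_rows:
            # Optionally separate consecutive tables with empty rows.
            merged.extend([] for _ in range(gap_rows))

        # Append the data rows (everything below the header).
        merged.extend(list(row) for row in table[1:])

    return merged


# Made-up sample data standing in for tables found on different pages.
tables = [
    [["Roll Size", "Price"], ["12 in", "4.50"]],
    [["Notes"], ["Prices subject to change"]],
    [["Roll Size", "Price"], ["24 in", "8.25"]],
]
combined = merge_tables(tables, "Roll Size")
```

Each row of `combined` would then be written to the first worksheet in order, one cell per value.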

@vssaini
It seems that Aspose.Pdf does not identify the tables from page 5 onward, so they are not imported.

@simon.zhao First of all, thanks a lot for taking the time to help me out.

When I run my code, it is able to read pages 5, 6, and 7 too. But I’m not sure why it misses the table where the columns begin with “Roll Size”.

See the Excel file generated by my code: https://docs.google.com/spreadsheets/d/1K-i_SjkXtnp2r9b_UCAdWnK3ZrHca7B0/edit?usp=sharing&ouid=114931800037126868341&rtpof=true&sd=true

If I use SautinSoft to create the Excel file, I get this one: https://docs.google.com/spreadsheets/d/1Jao_lxv4BSJujGQlfbglc1MvGRobIEHi/edit?usp=sharing&ouid=114931800037126868341&rtpof=true&sd=true. Can you please help me get the data from this Excel file into a single sheet?

@vssaini

Can you please explain how the table is missing when the column begins with “Roll Size”? Can you also share a minimal code snippet so that we can replicate this behavior in our environment?

@asad.ali I already shared the project zip file in my original question.

When you run this project, the sheet named Page5 in the resulting Excel file will show you what I mean.

We have opened the following new ticket(s) in our internal issue tracking system and will deliver the fixes according to the terms mentioned in our Free Support Policies.

Issue ID(s): PDFNET-53607

You can obtain Paid Support services if you need support on a priority basis, along with direct access to our Paid Support management team.