Extract PDF table data using Java

I want to read a table from a PDF.

Hi there,


Thanks for your inquiry. I am afraid Aspose.Pdf does not currently support manipulating existing tables in a PDF file. We have already logged a feature request to read and manipulate existing tables in PDF files. We have linked your post to the issue and will notify you as soon as it is implemented.

However, as a workaround, you can convert the PDF file to Excel using the following code and then use Aspose.Cells to read the data from the Excel worksheet.

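// myDir is the folder path that holds the input PDF (defined elsewhere)
Document doc = new Document(myDir + "ZKB.pdf");
// Save the document in Excel format
ExcelSaveOptions options = new ExcelSaveOptions();
doc.save(myDir + "output.xls", options);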



Please feel free to contact us for any further assistance.

Best Regards,

Hi Ahmad,

I tried the code you provided, using Maven, but I got the following error:

Failure to find com.aspose:aspose-pdf:jar:14.5.0 in http://maven.aspose.com/artifactory/simple/ext-release-local/…

Hi there,

Thanks for your feedback. Please note that the latest version of Aspose.Pdf for Java is 9.3.1. Moreover, Aspose.Pdf for Java 9.3.1 is uploaded as "aspose-pdf-9.3.1-jdk16.jar", so you need to use the classifier tag with the value jdk16 in your pom.xml, as follows. This should fix the issue.

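<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-pdf</artifactId>
    <version>9.3.1</version>
    <classifier>jdk16</classifier>
</dependency>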
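
To illustrate why the classifier matters, here is a minimal sketch, assuming Maven's default repository layout, in which the jar file is named artifactId-version[-classifier].jar. The class MavenJarName and the method jarFileName are hypothetical names used only for this illustration; they are not part of Maven or Aspose.

```java
// Hypothetical helper (not part of Maven or Aspose): builds the jar file name
// that Maven's default repository layout uses for a set of coordinates.
class MavenJarName {

    // Returns artifactId-version.jar, or artifactId-version-classifier.jar
    // when a non-empty classifier is given.
    static String jarFileName(String artifactId, String version, String classifier) {
        StringBuilder name = new StringBuilder(artifactId).append('-').append(version);
        if (classifier != null && !classifier.isEmpty()) {
            name.append('-').append(classifier);
        }
        return name.append(".jar").toString();
    }

    public static void main(String[] args) {
        // Without a classifier, the name differs from the uploaded file
        System.out.println(jarFileName("aspose-pdf", "9.3.1", null));
        // With the jdk16 classifier, the name matches the uploaded file
        System.out.println(jarFileName("aspose-pdf", "9.3.1", "jdk16"));
    }
}
```

Only the second name matches the uploaded "aspose-pdf-9.3.1-jdk16.jar", which is why a dependency declared without the classifier is not found.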

Please feel free to contact us for any further assistance.

Best Regards,

Is there any update regarding reading PDF table data?

Hi there,


Thanks for your inquiry. I am afraid the table-related issue is still not resolved; it is pending investigation. We will notify you as soon as we make significant progress towards resolving it.

We are sorry for the inconvenience caused.

Best Regards,

Hello Ahmad,

Has the functionality for reading tables from a PDF file been added to Aspose.Pdf?
I also want to read tables from PDFs.

Thanks in advance.

Hi Rajesh,


Thanks for your inquiry. I am afraid this feature is still not implemented, as most PDF documents do not contain any markup that identifies tables. Please share your sample PDF document here; if it is a tagged PDF, we can look into it and try to provide a solution.

Moreover, you may try the workaround suggested above: convert the PDF to Excel and use Aspose.Cells to read the table data.

We are sorry for the inconvenience caused.

Best Regards,

Hi

Is there any update regarding reading PDF table data?

Thx

Hi Rajesh,


Thanks for your patience.

The development team is still working on implementing this feature, and I am afraid that, due to its complexity, it is not yet implemented. Please note that Aspose.Pdf for Java is an auto-ported version of its .NET sibling, so we first need to implement the feature in Aspose.Pdf for .NET, and then the same feature will be ported to Aspose.Pdf for Java.

Your patience and understanding are greatly appreciated in this regard.

Hi Rajesh,


Thanks for your patience.

We are pleased to share that the issue reported earlier is resolved and its fix will be included in the upcoming release, Aspose.Pdf for Java 10.6.0. To generate correct output, please try the following code snippet.

[Java]

Document pdfDocument = new Document(myDir + "table.pdf");
// Create a TableAbsorber object to find tables
TableAbsorber absorber = new TableAbsorber();
// Visit the first page with the absorber
absorber.visit(pdfDocument.getPages().get_Item(1));
// Get the first table on the page, its first row and cell, and a text fragment in that cell
TextFragment fragment = absorber.getTableList().get_Item(0)
        .getRowList().get_Item(0)
        .getCellList().get_Item(0)
        .getTextFragments().get_Item(1);
// Change the text of the first text fragment in the cell
fragment.setText("hi world");
// Save the modified document
pdfDocument.save(myDir + "out_table_1060.pdf");

The issues you found earlier (filed as PDFNEWJAVA-33729) have been fixed in Aspose.Pdf for Java 10.6.0.


This message was posted using Notification2Forum from Downloads module by Aspose Notifier.