Free Support Forum - aspose.com

Table extraction does not return all cell values

We are trying to extract a table using the code below, but we are not able to extract all the values from the table cells.
example_011.pdf (26.8 KB)

@Test
public void extractTableTest() throws Exception {
    File srcFile = getFileFromResource("Alcon_FortWorth.pdf");
    // try-with-resources closes the input stream even if extraction fails
    try (FileInputStream fis = new FileInputStream(srcFile)) {
        Document pdfDocument = new Document(fis);
        TableAbsorber ta = new TableAbsorber();
        PageCollection pages = pdfDocument.getPages();
        System.out.println(pages.size() + " Pages.");
        // Page indexes in Aspose.PDF start at 1
        for (int i = 1; i <= pages.size(); i++) {
            System.out.println(" Page " + i);
            ta.visit(pages.get_Item(i));
            // The element type is required for the typed for-each loop below
            IGenericList<AbsorbedTable> tableList = ta.getTableList();

            for (AbsorbedTable at : tableList) {
                System.out.println("Table: " + at.getRectangle().getLLY() + ", " + at.getRectangle().getURY());
                IGenericList<AbsorbedRow> rowList = at.getRowList();
                for (AbsorbedRow row : rowList) {
                    IGenericList<AbsorbedCell> cellList = row.getCellList();
                    for (AbsorbedCell cell : cellList) {
                        // A cell can hold several text fragments; print each one
                        TextFragmentCollection textFragments = cell.getTextFragments();
                        for (TextFragment tf : textFragments) {
                            System.out.println(tf.getText() + "     ");
                        }
                    }
                    System.out.println("\n");
                }
                System.out.println("\n");
            }
        }
    }
}

@danros,

Please note that the TableAbsorber class cannot retrieve a table that does not have complete borders. In addition, we were able to reproduce the problem of not all cell values being retrieved. It has been logged under the ticket ID PDFJAVA-37353 in our bug tracking system. We have linked your post to this ticket and will keep you informed of any updates.

The issue you reported earlier (filed as PDFJAVA-37353) has been fixed in Aspose.PDF for Java 21.10.
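
When re-checking the output after an upgrade, it can help to print each table row on a single line with empty cells marked, so that a missing value is easy to spot. The following is only a minimal sketch of that idea: `RowFormatter` and `formatRow` are hypothetical names that are not part of Aspose.PDF, and plain Java lists stand in for the cell and fragment collections used in the test above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowFormatter {
    // Hypothetical helper (not an Aspose API): joins the text fragments of
    // each cell with a space, then joins the cells of a row with " | ".
    // Cells without any text are shown as "<empty>" so gaps stand out.
    public static String formatRow(List<List<String>> cells) {
        List<String> parts = new ArrayList<>();
        for (List<String> fragments : cells) {
            String text = String.join(" ", fragments).trim();
            parts.add(text.isEmpty() ? "<empty>" : text);
        }
        return String.join(" | ", parts);
    }

    public static void main(String[] args) {
        // Sample data only: three cells, the middle one has no fragments
        List<List<String>> row = Arrays.asList(
                Arrays.asList("Site", "name"),
                new ArrayList<String>(),
                Arrays.asList("42"));
        System.out.println(formatRow(row)); // prints "Site name | <empty> | 42"
    }
}
```

In the test above, the inner list for each cell would be filled with the `tf.getText()` values from `cell.getTextFragments()`, and `formatRow` would be called once per row in place of the per-fragment `println`.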