How to know the type of content of a cell

Hello,

We would like to use Aspose.Words to extract the content of table cells from a DOCX document. Is there a way to determine the type of content in a cell, so that we can extract it as an image, text, or a formula?

Here is an example of what I am trying to do:

String foldername = "applications/classification/smoketest";
String filename = "Description-Large-Tables.docx";

Document document = TestUtils.loadAsposeDocxDocument(foldername, filename);
List<com.aspose.words.Table> officeTableList = AsposeDocumentUtils.getTables(document);
List<org.epo.filing.docxconversion.st36.content.Table> tableObjectList = new ArrayList<>();

for (com.aspose.words.Table officeTable : officeTableList) {
    tableObjectList.add((org.epo.filing.docxconversion.st36.content.Table)
            renderOfficeTableToST36ElementAsObject(officeTable));
}


/**
 * Creating a ST36 table and setting the content from the Aspose table.
 */
public Object renderOfficeTableToST36ElementAsObject(final com.aspose.words.Table officeTable) {
try {
    org.epo.filing.docxconversion.st36.content.Table table = new org.epo.filing.docxconversion.st36.content.Table();

    //Separation of rows and columns //spacing or width?
    double spacing = officeTable.getCellSpacing();
    //double width = officeTable.getPreferredWidth().getValue();
    table.setColsep(String.valueOf(spacing));
    table.setRowsep(String.valueOf(spacing));

    //Group of columns
    Tgroup tgroup1 = new Tgroup();

    //Body with rows
    Tbody tbody = new Tbody();
    int rowNumber = officeTable.getRows().getCount();
    int columnNumber = 0;

    // Iterate through all rows in the table
    for (int i = 0; i < rowNumber; i++) {
        Row row = officeTable.getRows().get(i);
        org.epo.filing.docxconversion.st36.content.Row tbodyRow = new org.epo.filing.docxconversion.st36.content.Row();
        int cellNumber = row.getCells().getCount();



        //Check if the number of cells is the same as the number of cells in the other rows,
        // if it is larger, add the corresponding columns to the new table
        if (cellNumber > columnNumber) {
            //Create a column for each cell
            for (int j = columnNumber; j < cellNumber; j++) {
                Cell cell = row.getCells().get(j);

                Colspec colspec = new Colspec();
                colspec.setColname("C0" + j);
                colspec.setColnum("" + j);
                colspec.setColwidth(Double.toString(cell.getCellFormat().getWidth()));

                tgroup1.getColspec().add(colspec);
            }
            columnNumber = cellNumber;
        }

        // Iterate through all cells in the row
        for (int j = 0; j < cellNumber; j++) {
            Cell cell = row.getCells().get(j);
            int cellNodeType = cell.getNodeType();  // Always returns 7: the type of the Cell node itself, not of its content

            String texto = cell.getText();
            String cellText = cell.toString(SaveFormat.TEXT).trim();
            Entry entry = new Entry();
            entry.setNamest("C0" + j);
            entry.getContent().add(cellText);

            tbodyRow.getEntry().add(entry);
        }
        tbody.getRow().add(tbodyRow);
    }
    tgroup1.setCols(Integer.toString(columnNumber));
    tgroup1.setTbody(tbody);
    table.getTgroup().add(tgroup1);

    return table;
} catch (Exception e) {
    LOGGER.error(TABLE_ML_CONVERSION_EXCEPTION, e);
    return null;
}
}
}

Many thanks!

@josgobo

Thanks for your inquiry. You can use the CompositeNode.ChildNodes property (getChildNodes() in Java) to get all immediate child nodes of a node, and the Node.NodeType property (getNodeType() in Java) to get the type of each node. All text in the document is stored in Run nodes, and an image is imported into the Aspose.Words DOM as a Shape node.

If you run into any issues, please ZIP and attach your input and expected output documents here for our reference. We will then provide more information about your query, along with code.

Thank you very much for your reply. I have extracted a table from a Word document in DOCX format using Aspose.Words. I then go through all the contents of this table to build a new table from a different type of object. Text is working well, but now I need to know what type of data each cell contains, so that I can insert it into my new table in the appropriate way. I have edited the code in the first message and added the whole process I am attempting.

@josgobo

Thanks for sharing the details. We suggest reading the following article:
Aspose.Words Document Object Model

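You can use the Node.NodeType property to determine the type of content in the cell. Check it on the nodes inside the cell rather than on the cell itself, since the cell always reports its own Cell type.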
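
The idea of walking the nodes inside a cell and checking each node's type could be sketched roughly as below. This is only an illustration under stated assumptions, not Aspose code: `CellContentSketch`, `Kind`, `Content` and `Node` are hypothetical stand-ins so that the example runs without the library. With Aspose.Words for Java, one would presumably iterate over `cell.getChildNodes(NodeType.ANY, true)` instead and compare each `getNodeType()` result against `NodeType.RUN`, `NodeType.SHAPE` and `NodeType.OFFICE_MATH`. A Shape is not necessarily a picture, so an additional check such as `Shape.hasImage()` may be needed.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a document node tree; not part of Aspose.Words.
class CellContentSketch {

    /** Stand-in for the node types of interest (assumed to mirror NodeType constants). */
    enum Kind { CELL, PARAGRAPH, RUN, SHAPE, OFFICE_MATH }

    /** Content categories the table conversion needs to distinguish. */
    enum Content { TEXT, IMAGE, FORMULA }

    /** Minimal tree node: a kind plus its child nodes. */
    static final class Node {
        final Kind kind;
        final List<Node> children = new ArrayList<>();

        Node(Kind kind, Node... kids) {
            this.kind = kind;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    /** Visits the descendants of the cell and records which content categories occur. */
    static Set<Content> classify(Node cell) {
        Set<Content> found = EnumSet.noneOf(Content.class);
        collect(cell, found);
        return found;
    }

    private static void collect(Node node, Set<Content> found) {
        for (Node child : node.children) {
            switch (child.kind) {
                case RUN:
                    found.add(Content.TEXT);
                    break;
                case SHAPE:
                    found.add(Content.IMAGE);
                    break;
                case OFFICE_MATH:
                    found.add(Content.FORMULA);
                    // Do not descend: runs inside a formula are not plain cell text.
                    continue;
                default:
                    break;
            }
            collect(child, found);
        }
    }

    public static void main(String[] args) {
        // A cell with one paragraph holding text and a picture, and one holding a formula.
        Node cell = new Node(Kind.CELL,
                new Node(Kind.PARAGRAPH, new Node(Kind.RUN), new Node(Kind.SHAPE)),
                new Node(Kind.PARAGRAPH, new Node(Kind.OFFICE_MATH, new Node(Kind.RUN))));
        System.out.println(classify(cell)); // prints [TEXT, IMAGE, FORMULA]
    }
}
```

The sketch returns a set rather than a single value because one cell can mix text, pictures and formulas. The walk is recursive because the immediate children of a cell are paragraphs (or nested tables), so runs and shapes only appear one level further down.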

Many thanks!