Merge Cells information is wrong

Hi,

We have run into a problem.
We want to read table information from a Word document, and we found that we cannot get the correct information when the table has horizontally merged cells.
Below is our demo code; our Word document is attached:

public static void main(String[] args) {
    Document doc = null;
    try {
        doc = new Document("test.docx");
    }
    catch (Exception e) {
        e.printStackTrace();
        return; // nothing to process if the document could not be loaded
    }

    // Walk every section and process the nodes in its body.
    SectionCollection sections = doc.getSections();
    for (int i = 0; i < sections.getCount(); i++) {
        Section section = sections.get(i);
        NodeCollection<?> nodes = section.getBody().getChildNodes();
        convertParsTables(doc, nodes);
    }
}

private static void convertParsTables(Document doc, NodeCollection<?> nodes) {
    int count = nodes.getCount();
    for (int i = 0; i < count; i++) {
        Node node = nodes.get(i);
        if (node instanceof Table) {
            Table table = (Table) node;
            convertTable(doc, table);
        }
    }
}

private static void convertTable(Document doc, Table table) {
    RowCollection rows = table.getRows();
    for (int r = 0; r < rows.getCount(); r++) {
        Row row = rows.get(r);
        convertRow(doc, rows, row, r);
    }
}

private static void convertRow(Document doc, RowCollection rows, Row row, int r) {
    CellCollection cells = row.getCells();
    for (int c = 0; c < cells.getCount(); c++) {
        Cell cell = cells.get(c);
        convertCell(doc, rows, cells, r, c, cell);
    }
}

private static void convertCell(Document doc, RowCollection rows, CellCollection cells, int r, int c, Cell cell) {
    // Print the horizontal and vertical merge type of the cell.
    CellFormat cellformat = cell.getCellFormat();
    int hmerge = cellformat.getHorizontalMerge();
    int vmerge = cellformat.getVerticalMerge();
    System.out.println(Integer.toString(hmerge));
    System.out.println(Integer.toString(vmerge));
}

The code marked in red cannot get the correct information; its result is always 0.

Hi Yu,


Thanks for your inquiry.

If you use the code from this article to print the horizontal and vertical merge types of cells in your table, you’ll find a few occurrences where it says that the cells are vertically merged.

Regarding horizontally merged cells, please note that, by design, rows in a table in a Microsoft Word document are completely independent. This means each row can have any number of cells of any width. So if the first row has one wide cell and the second row has two narrow cells, the cell in the first row will appear horizontally merged when you look at the document. But it is not a merged cell; it is just a single wide cell. Another perfectly valid scenario is when the first row has two cells, where the first cell has CellMerge.First and the second cell has CellMerge.Previous; in this case it is a merged cell. In both cases, the visual appearance in Microsoft Word is exactly the same. Both cases are valid, and Microsoft Word treats merged cells as one wide cell.
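
To illustrate the two cases described above, here is a minimal sketch in plain Java. It does not use Aspose.Words at all: the Merge enum, the CellInfo class and the widths are hypothetical stand-ins for what a converter would read from each cell's format. It only shows that a single wide cell and a First + Previous pair collapse to the same visual row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not the Aspose.Words API: each cell carries a merge
// flag and a width, as a converter might record them while walking a row.
class MergeModelSketch {
    enum Merge { NONE, FIRST, PREVIOUS }

    static final class CellInfo {
        final Merge merge;
        final double width;

        CellInfo(Merge merge, double width) {
            this.merge = merge;
            this.width = width;
        }
    }

    // Collapse a row into the cells a reader actually sees:
    // a PREVIOUS cell is absorbed into the visual cell to its left.
    static List<Double> visualWidths(List<CellInfo> row) {
        List<Double> out = new ArrayList<>();
        for (CellInfo cell : row) {
            if (cell.merge == Merge.PREVIOUS && !out.isEmpty()) {
                int last = out.size() - 1;
                out.set(last, out.get(last) + cell.width);
            } else {
                out.add(cell.width);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Case 1: one wide cell with no merge flag (widths are made up).
        List<CellInfo> wide = Arrays.asList(new CellInfo(Merge.NONE, 200));
        // Case 2: two cells flagged FIRST and PREVIOUS.
        List<CellInfo> merged = Arrays.asList(
                new CellInfo(Merge.FIRST, 100),
                new CellInfo(Merge.PREVIOUS, 100));

        // Both rows collapse to the same single visual cell.
        System.out.println(visualWidths(wide));
        System.out.println(visualWidths(merged));
    }
}
```

Because the first case carries no merge flag, code that relies only on the merge type will not see it as merged; the cell width is the only hint.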

If we can help you with anything else, please feel free to ask.

Best regards,

Hi Yu,

We now want to convert Word to HTML with our own code, but we have a problem when dealing with merged cells. In our HTML page, the merged cell is shown incorrectly.
Here is our demo:
1.png shows a table in our Word document, and 2.png shows the result displayed in our web page.
We also tried the method you provide:
document.save(OutputStream stream, int saveFormat), and with it the table is shown correctly in our web page.
So, can you tell us how to deal with merged cells when converting Word to HTML?

Hi Yu,

Thanks for your inquiry. Have you tried the latest version of Aspose.Words for Java, i.e. 15.8.0?
http://www.aspose.com/community/files/72/java-components/aspose.words-for-java/default.aspx

Are you using the following simple code to convert Word to HTML?

Document doc = new Document("input.docx");
doc.save("15.8.0.html");

In case the problem still remains, please attach your input Word document and corresponding output HTML file showing the undesired behavior here for testing. We will investigate the issue on our end and provide you more information.

PS: To attach these resources, please zip them and click the 'Reply' button, which will take you to the reply page; at the bottom of that page you can include attachments with your post by clicking the 'Add/Update' button.

Best regards,

Hi Yu,

We have used the simple code you suggested to convert to HTML, and it handles this issue without any problem.

But for some reason we have to use our own methods to convert Word to HTML, so can you tell us how to handle merged cells in that case?

Hi Yu,


Thanks for your inquiry. Please refer to my previous post here.

Secondly, please create a simple, standalone, runnable Java application (source code without compilation errors) that helps us reproduce your problem on our end and attach it here for testing. As soon as you get this simple application ready, we’ll start investigating your issue and provide you more information.

Best regards,

Hi Yu,

Sorry for replying after so many days.

I now have a simple demo that shows my problem.

test.doc is a document with a table in it, test.html is the result of converting test.doc to HTML, and ConvertWordToHtml.java is the demo I use to convert the doc to HTML.
You can see that I cannot get the correct table because the information about merged cells is lost. I must use my own method to convert the doc to HTML, so can you tell me how to deal with merged cells?

Hi Yu,


Thanks for the additional information. Using your code with Aspose.Words for Java 15.9.0, I managed to generate HTML, which I have attached to this post for your reference (see out-15.9.0.html). Please attach your ‘expected HTML’ here for our reference. We will investigate the structure of your expected document to see how you want the final output to look and provide you with code to achieve the same using Aspose.Words.

Best regards,

Hi Yu,


I have written my ‘expected HTML’ and attached it to this post for your reference (see reply.html). Can you help me get the expected HTML using Aspose.Words?

Hi Yu,


Thanks for your inquiry. If you use the code from this article to print the horizontal and vertical merge types of cells in your table, you’ll find no occurrences where it says that the cells are vertically or horizontally merged. Please refer to my earlier post for the reasoning. You can also observe that the last cell in the first row is twice as wide as the other cells:
Document doc = new Document(getMyDir() + "Word2Html\\test.doc");
Table tab = doc.getFirstSection().getBody().getTables().get(0);
// Print the width of every cell, row by row.
for (Row row : tab.getRows()) {
    for (Cell cell : row.getCells()) {
        System.out.println(cell.getCellFormat().getWidth());
    }
}
So, this is expected behavior. If we can help you with anything else, please feel free to ask.
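
If you need to produce merged-looking cells in your own HTML output, one possible approach is to derive a colspan for each cell from the cell widths printed above. The following is only a rough sketch in plain Java under several assumptions: the widths are made-up sample values, not the ones from your document; the class and method names (ColspanSketch, gridEdges, colspans, toHtmlRow) and the 0.5 tolerance are invented for illustration; and vertical merging and HTML escaping are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative only: derives colspan values from per-row cell widths.
class ColspanSketch {
    // Assumed tolerance for treating two cell edges as the same grid line.
    static final double EPS = 0.5;

    // Collect the right edge of every cell in every row into one sorted grid.
    static List<Double> gridEdges(double[][] rowWidths) {
        TreeSet<Double> sorted = new TreeSet<>();
        for (double[] row : rowWidths) {
            double x = 0;
            for (double w : row) {
                x += w;
                sorted.add(x);
            }
        }
        // Drop edges that are within EPS of the previously kept edge.
        List<Double> edges = new ArrayList<>();
        for (double e : sorted) {
            if (edges.isEmpty() || e - edges.get(edges.size() - 1) > EPS) {
                edges.add(e);
            }
        }
        return edges;
    }

    // A cell spans as many grid columns as there are grid edges
    // between its left edge (exclusive) and its right edge (inclusive).
    static int[] colspans(double[] row, List<Double> edges) {
        int[] spans = new int[row.length];
        double left = 0;
        for (int i = 0; i < row.length; i++) {
            double right = left + row[i];
            int count = 0;
            for (double e : edges) {
                if (e > left + EPS && e <= right + EPS) {
                    count++;
                }
            }
            spans[i] = Math.max(count, 1);
            left = right;
        }
        return spans;
    }

    // Emit one <tr>; cell text is written as-is (no HTML escaping here).
    static String toHtmlRow(String[] texts, int[] spans) {
        StringBuilder sb = new StringBuilder("<tr>");
        for (int i = 0; i < texts.length; i++) {
            sb.append("<td");
            if (spans[i] > 1) {
                sb.append(" colspan=\"").append(spans[i]).append('"');
            }
            sb.append('>').append(texts[i]).append("</td>");
        }
        return sb.append("</tr>").toString();
    }

    public static void main(String[] args) {
        // Sample widths: the last cell of the first row is twice as wide.
        double[][] widths = { {100, 100, 200}, {100, 100, 100, 100} };
        List<Double> edges = gridEdges(widths);
        System.out.println(toHtmlRow(new String[] {"A", "B", "C"}, colspans(widths[0], edges)));
        System.out.println(toHtmlRow(new String[] {"1", "2", "3", "4"}, colspans(widths[1], edges)));
    }
}
```

In a real converter the widths array would be filled from cell.getCellFormat().getWidth() for each row of the table, and the same grid would be reused for all rows of that table.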

Best regards,