Extra border appears after update of Rich Text SDT using builder.insertHtml() api

Hello team.
We encountered an issue when updating an existing Word document containing one Rich Text embedded control
that has a table with inserted HTML.
When the given embedded content is updated as follows:

  1. richText.removeAllChildren() (where richText is the given SDT)
  2. add the paragraph to the richText control
       Paragraph para = new Paragraph(targetDoc);
       richText.appendChild(para);
  3. insert given html
       builder.moveTo(para);
       builder.insertHtml(html, false);
  4. remove the previously added paragraph
       builder.moveTo(para);
       builder.getCurrentParagraph().remove();

there appears an extra border on the left and top of the embedded control.

Attached is the archive ExtraBorderInSDTEmbeddedContent.7z containing the following:

  • DocWithContentControlWithBorder.docx.docx - input Word document containing one embedded content with the tag ‘0001’;
  • DocWithContentControlWithBorder_ECupdated - output Word document after the embedded content is updated with Aspose.Words version 25.1;
  • ec_added.html - input HTML table to be inserted into Word;
  • WordContainerManagerTest.java - sample test program illustrating the issue;
  • WordContainerManager.java - small program that updates the given existing content control;
  • AsposeWordUtility.java - helper utility class used by WordContainerManager

This issue occurs under both Linux 7 and Windows 11 OS.
Please let us know if you need additional information.
Thank you.
ExtraBorderInSDTEmbeddedContent.7z (76.9 KB)

@oraspose

Cause

The issue of an extra border appearing around the Rich Text Structured Document Tag (SDT) after updating it with HTML content may stem from how the content is inserted and the subsequent removal of the paragraph. This can lead to unintended formatting changes, particularly with borders.

Solution

To resolve this issue, you can try the following steps:

  1. Avoid Removing the Paragraph Immediately: Instead of removing the paragraph right after inserting the HTML, consider modifying the content directly or using a different approach to manage the content within the SDT.

  2. Use HtmlInsertOptions: When inserting HTML, utilize the HtmlInsertOptions to control how the HTML is processed. For example, you can specify options to remove any empty paragraphs that may be causing the border issue.

  3. Check Borders and Shading: After inserting the HTML, check the properties of the SDT to ensure that no unwanted borders or shading are applied. You can explicitly set the border properties to ensure they are as desired.

Code Example

Here’s a modified version of your code that incorporates these suggestions:

// Assuming 'richText' is your SDT and 'builder' is your DocumentBuilder
richText.removeAllChildren(); // Clear existing content

// Create a new paragraph
Paragraph para = new Paragraph(targetDoc);
richText.appendChild(para);

// Move to the paragraph and insert HTML with options
builder.moveTo(para);
HtmlInsertOptions options = new HtmlInsertOptions();
options.setRemoveLastEmptyParagraph(true); // Remove any empty paragraphs
builder.insertHtml(html, options);

// Optionally, check and set border properties
richText.getBorders().setLineStyle(LineStyle.None); // Remove borders if needed

Next Steps

  1. Test the modified code to see if the extra border issue is resolved.
  2. If the problem persists, check the original document’s formatting and styles to ensure there are no conflicting settings.
  3. If you continue to experience issues, please provide additional details or a sample document for further investigation.

Please let me know if this resolves your issue or if you need further assistance!

Need further assistance.
The suggested code HtmlInsertOptions options = new HtmlInsertOptions() does not compile.
Setting REMOVE_LAST_EMPTY_PARAGRAPH like this:

int insertOptions = HtmlInsertOptions.REMOVE_LAST_EMPTY_PARAGRAPH;
builder.insertHtml(html, insertOptions);

also did not help.
Thank you.
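
For context on the compile error: in the Java edition, HtmlInsertOptions appears to be a holder of int constants rather than an object you instantiate, which matches the snippet above that does compile. The stand-alone sketch below only illustrates that pattern. The class InsertFlags and its numeric values are hypothetical stand-ins, not the library's actual definitions.

```java
// Hypothetical stand-in for an "enum as int constants" class.
// Names and values are illustrative assumptions, not Aspose.Words definitions.
final class InsertFlags {
    static final int USE_BUILDER_FORMATTING = 1;
    static final int REMOVE_LAST_EMPTY_PARAGRAPH = 2;

    // A private constructor means "new InsertFlags()" does not compile
    // outside this class, similar to the error reported above.
    private InsertFlags() {}

    // Returns true when every bit of 'flag' is set in 'options'.
    static boolean has(int options, int flag) {
        return (options & flag) == flag;
    }
}
```

With constants of this kind, several options are combined with bitwise OR and passed as a single int, in the same way the snippet above passes insertOptions to builder.insertHtml.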

@oraspose This is not a bug but expected behavior. The structured document tag in your input document is inside a table cell with a border.

Your HTML also represents a table with a border, so after the HTML is inserted into the SDT there are two nested tables.

To resolve the problem, you can either modify your input document and put the whole table into the SDT, in which case the table will be removed when the SDT’s content is cleared, or modify your input HTML so that it contains only textual content instead of a table.
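
The text-only HTML option could be approximated by pre-processing the HTML string before it is inserted. Below is a minimal sketch that uses only the Java standard library. The class name HtmlTableFlattener and the method flatten are hypothetical, and the regex approach assumes a simple, non-nested table whose cell text is already HTML-escaped; a real HTML parser would be more robust.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns each table cell into a plain paragraph,
// dropping the table (and therefore its border) from the HTML.
final class HtmlTableFlattener {
    // Matches <td>...</td> and <th>...</th>; \b keeps <thead> from matching.
    private static final Pattern CELL = Pattern.compile(
            "<t[dh]\\b[^>]*>(.*?)</t[dh]>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // Matches any remaining tag inside a cell, e.g. <b> or <span ...>.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    private HtmlTableFlattener() {}

    static String flatten(String html) {
        StringBuilder out = new StringBuilder();
        Matcher cells = CELL.matcher(html);
        while (cells.find()) {
            // Strip inline markup and surrounding whitespace from the cell body.
            String text = TAG.matcher(cells.group(1)).replaceAll("").trim();
            if (!text.isEmpty()) {
                out.append("<p>").append(text).append("</p>");
            }
        }
        // If no cell text was found, return the input unchanged.
        return out.length() == 0 ? html : out.toString();
    }
}
```

The returned string can then be passed to builder.insertHtml in place of the original html value, so that only paragraphs, and no second table, end up inside the bordered cell.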