Normal style changed- inserted text not keeping destination styles

Good day -

We are using a Word template and manipulating it with Aspose.Words.

Attached is the template (testSpaceBefore.docx)

Note that there is a bookmark at the text “.” The text is set to the Normal style, and the Normal style has a Spacing Before value of 12pt.

We then generate the following HTML to insert at the “” text:

.Normal{}

  • Here is a test bullet

  • Here is another test bullet

We use the following code to insert the HTML into the document at the bookmark location:

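<!--- Only insert when the bookmark exists in the template. --->
<cfif templateDocBuilder.moveToBookmark("options")>
<!--- Tell Aspose.Words to read the HTML as UTF-8. --->
<cfset charset=createObject("java","java.nio.charset.Charset")>
<cfset charset=charset.forName("UTF-8")>
<cfset loadOptions=createObject("java","com.aspose.words.LoadOptions").init()>
<cfset loadOptions.setEncoding(charset)>

<!--- Wrap the HTML string in a byte stream and load it into a temporary document. --->
<cfset bytestream=createObject("java","java.io.ByteArrayInputStream").init(toBinary(toBase64(testHTMLContent)))>
<cfset tempDoc=createObject("java","com.aspose.words.Document").init(bytestream,loadOptions)>
<!--- Import nodes from tempDoc into doc using the destination document's styles. --->
<cfset ni=createObject("java","com.aspose.words.NodeImporter").init(tempDoc,doc,importFormatMode.USE_DESTINATION_STYLES)>

<!--- The setup of importNode (the imported node) and cnode (the insertion point) is not shown in this post. --->
<cfset cnode.getParentNode().insertAfter(importNode,cnode)>
</cfif>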

Note that the NodeImporter is set to USE_DESTINATION_STYLES. We would expect that the inserted HTML would insert the paragraphs as Normal style, and that the Normal style properties are unaffected.
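
To make the expectation above concrete, here is a toy Java model of how a style name could be resolved under the two import approaches. It is only a sketch of the idea, not Aspose.Words code: the class StyleImportModel, its two methods and the set-of-names "style table" are all hypothetical, and the "_0" suffix rule is an assumption that simply mirrors the Normal_0 and PictureCaption_0 names seen in the output documents.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model of style-name resolution on import; NOT the Aspose.Words implementation. */
public class StyleImportModel {

    /** Destination styles win: a source style with a matching name maps onto the existing one. */
    static String useDestinationStyles(Set<String> destStyles, String sourceStyle) {
        // A style that does not exist in the destination yet is copied under its own name.
        destStyles.add(sourceStyle);
        return sourceStyle;
    }

    /** Source formatting wins: on a name clash the source style is copied under a fresh name. */
    static String keepSourceFormatting(Set<String> destStyles, String sourceStyle) {
        String name = sourceStyle;
        int suffix = 0;
        // Assumed naming rule: append _0, _1, ... until the name is unused.
        while (destStyles.contains(name)) {
            name = sourceStyle + "_" + suffix++;
        }
        destStyles.add(name);
        return name;
    }

    public static void main(String[] args) {
        Set<String> dest = new LinkedHashSet<>();
        dest.add("Normal");
        dest.add("PictureCaption");

        System.out.println(useDestinationStyles(dest, "Normal")); // prints "Normal"
        System.out.println(keepSourceFormatting(dest, "Normal")); // prints "Normal_0"
    }
}
```

In this model, the Normal_0 style reported in the output document is what the second method would produce, which is why it looks unexpected when USE_DESTINATION_STYLES was requested.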

The output document is attached (testSpaceBefore_After.docx). The text is inserted; however, two things have happened:

  • The inserted text is defined as Normal_0, which I think should not have happened, since we indicated to USE_DESTINATION_STYLES.
  • The document’s Normal style no longer has a property of Space Before: 12pt. It is now at 0pt. The insertion seems to have changed the original Normal style, but not applied it.

We are using Aspose.Words 15.7.0. Please advise. Thank you.

Hi Chris,

Thanks for your inquiry. In your case, I suggest using DocumentBuilder.insertHtml to insert the HTML into the document, with “*Normal” as the style name in the HTML. Please see the following Java code example. I have attached the output document for your reference.

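import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;

// MyDir is the folder that contains the test documents.
Document doc = new Document(MyDir + "testSpaceBefore.docx");
DocumentBuilder builder = new DocumentBuilder(doc);

// Move the cursor to the bookmark that marks the insertion point.
builder.moveToBookmark("Options");

// Passing true makes insertHtml use the builder's current formatting as the base.
builder.insertHtml("<html>"
    + "<head>"
    + "<style>.*Normal{}</style>"
    + "</head>"
    + "<body>"
    + "  <ul>"
    + "    <li>Here is a test bullet</li>"
    + "  <li>Here is another test bullet</li>"
    + "</ul>"
    + "</body>"
    + "</html>", true);

doc.save(MyDir + "Out.docx");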

Thanks Tahir, but we are using the node method because of another issue. Perhaps you can help with it and then insertHTML can work.

In the original document, there is a custom style defined called PictureCaption.

If we insert the code:

.*Normal{}
.*PictureCaption{}

  • Here is a test bullet

  • Here is another test bullet that should be PictureCaption

using insertHTML, the custom PictureCaption style does not get applied to the second paragraph. It remains Normal (I’ve attached the document).

I also tried defining PictureCaption in the <style> section as .PictureCaption{} but that did not work - it gets inserted as style PictureCaption_0.

Is there a way to get this custom style (defined in the doc file) inserted as PictureCaption?

Thank you.

Hi Chris,

Thanks for your inquiry. Please note that Aspose.Words mimics the behavior of MS Word. If you run the same scenario in MS Word, you will get the same output.

  1. You are loading HTML into the Aspose.Words DOM with empty “Normal” and “PictureCaption” styles, so the default font formatting is used for this HTML document. You can check this by loading the HTML document into the Aspose.Words DOM and saving it as DOCX.

  2. You are inserting this (HTML) document into testSpaceBefore.docx with USE_DESTINATION_STYLES. In this case, an extra font style named ‘PictureCaption + After: 12 pt’ is created in the output document. MS Word generates the same font style when this HTML document is inserted into the main document. See the attached image for details.

In your case, I suggest that you do not use the ‘Normal’ style in the HTML and use only PictureCaption, as shown below.

.PictureCaption{}

  • Here is a test bullet

  • Here is another test bullet that should be PictureCaption

Instead of inserting each node into the main document, I suggest using the DocumentBuilder.insertDocument method as shown below. I have attached the output document to this post for your reference.

Hope this answers your query. Please let us know if you have any more queries.

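// Load the Word template; this is the destination of the insert.
Document dstDoc = new Document(MyDir + "testSpaceBefore.docx");
// Load the HTML as its own document; this is the source of the insert.
Document srcDoc = new Document(MyDir + "in.html");
// Optional: save the converted HTML on its own to inspect its styles.
srcDoc.save(MyDir + "htmlOut.docx");
DocumentBuilder builder = new DocumentBuilder(dstDoc);
builder.moveToBookmark("Options");
// Insert the HTML document at the bookmark using the template's styles.
builder.insertDocument(srcDoc, ImportFormatMode.USE_DESTINATION_STYLES);
dstDoc.save(MyDir + "Out.docx");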

Thanks Tahir,

I think we can do the insertion without the nodewise conversion.

But I do have one other issue that is coming up: when inserting a table.

The inserted HTML above is modified to include a table:

.*PictureCaption{}

  • Here is a test bullet

  • Here is another test bullet that should be PictureCaption

Here is a table:

Column 1 Column 2 Column 3
Cell 1 Cell 2 Cell 3
Cell 4 Cell 5 Cell 6

The input document contains a table created in Word - notice the cell text retains the property of the Normal style.

The output document shows the inserted table, but I am unable to get the table cells to honor the Normal font. I’ve attached a new input and output document.

I’ve tried putting a Normal class on the table elements, putting a wrapper element with a Normal class inside each cell, and several other things.

Is it possible to retain the Normal style (or other existing styles within the template) within HTML table cells?

Thank you.

For reference, on a production server, we are using an older version, Aspose.Words 14.11.0, and the above referenced code outputs as expected, as shown in the attached document.

Hi Chris,

Thanks for your inquiry. I have tested the scenario with the shared HTML and noticed that style names starting with * are not imported into the Aspose.Words DOM. I have logged this problem in our issue tracking system as WORDSNET-12272.

Regarding your query about importing styles from HTML and inserting the HTML into the target document with UseDestinationStyles, I have tested this scenario using the following code example. The table’s contents do not get the ‘Normal’ style. I have logged this issue as WORDSNET-12273.

I have linked this forum thread to both issues, and you will be notified here once they are resolved. We apologize for the inconvenience.

Document doc = new Document(MyDir + "testSpaceBefore.docx");
Document htmldoc = new Document(MyDir + "in.html");
DocumentBuilder builder = new DocumentBuilder(doc); 
builder.moveToBookmark("Options");
builder.insertDocument(htmldoc, ImportFormatMode.USE_DESTINATION_STYLES);
doc.save(MyDir + "Out.docx");

Moreover, please note that formatting is applied on several levels. Take simple text as an example. Text in a document is represented by Run nodes, and a Run can only be a child of a Paragraph. You can apply formatting 1) to Run nodes through character styles (e.g. a Glyph Style), 2) to the parent Paragraph node (possibly via paragraph styles), and 3) directly to Run nodes through Run attributes (Font). The Run then takes its formatting from the paragraph style first, then the character style, and finally the direct formatting, with each later level overriding the earlier ones.
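
The layering described above can be sketched as a small, self-contained Java model. This is an illustration only and does not use the Aspose.Words API: the class FormattingLayers, the resolve method and the attribute names ("font", "size", "bold") are hypothetical, and each formatting level is reduced to a plain map of attribute values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of layered text formatting; the class and attribute names are hypothetical. */
public class FormattingLayers {

    /** Merges the layers in order; a later layer overrides any attribute set by an earlier one. */
    @SafeVarargs
    static Map<String, String> resolve(Map<String, String>... layers) {
        Map<String, String> effective = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            effective.putAll(layer);
        }
        return effective;
    }

    public static void main(String[] args) {
        Map<String, String> paragraphStyle = Map.of("font", "Calibri", "size", "11", "bold", "false");
        Map<String, String> characterStyle = Map.of("bold", "true");
        Map<String, String> directRunFormatting = Map.of("size", "14");

        // Order matters: paragraph style first, then character style, then direct formatting.
        Map<String, String> effective = resolve(paragraphStyle, characterStyle, directRunFormatting);
        System.out.println(effective.get("font") + " " + effective.get("size")
                + " bold=" + effective.get("bold")); // prints "Calibri 14 bold=true"
    }
}
```

Under this model, text in an inserted table cell can look different from Normal even when its paragraph style is Normal, because a character style or direct formatting from the HTML import sits on top of it.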

Thank you Tahir.

Please note that for the example, I am using documentBuilder.insertHTML(html,true)

Can you confirm which of the two issue numbers relates to the example I posted above? In that example, note that I am not trying to import a style, but to apply an existing style (Normal) to the text inside an inserted HTML

Thanks again.

Hi Chris,

Thanks for your inquiry.

backprop:
“Please note that for the example, I am using documentBuilder.insertHTML(html,true)”

If you use the insertHtml method to insert HTML into a Word document, and both the HTML and the Word document define a style such as PictureCaption, the output document will contain an extra style, PictureCaption_0. The insertHtml method does not use ImportFormatMode. To avoid this, please load the HTML into a separate Document, as shown in my previous post, and insert that Document into the target Word document with ImportFormatMode set to UseDestinationStyles. However, this approach does not currently return the correct output. The issue ID for this problem is WORDSNET-12273.

backprop:
“Can you confirm which of the two issue numbers relates to the example I posted above? In that example, note that I am not trying to import a style, but to apply an existing style (Normal) to the text inside an inserted HTML”

The issue ID is WORDSNET-12273.

Although I am not able to find it through searching, I wonder if there is a reference on how insertHTML() and insertDocument() behave when inserting HTML documents with styles defined in the <style> section.

I have tried so many combinations of the above, both by altering the HTML document as well as the insert method, but each time one thing is fixed, another seems to break.

Very simply, I would like to generate HTML that consistently re-uses multiple styles that are already defined in an existing .docx file. In my example, I want to insert HTML that uses the *Normal and PictureCaption styles defined in the base document.

.*Normal{} .PictureCaption{}

This paragraph is Normal and should have Space Before: 12pt.

  • Here is a test bullet that should be Normal and should have Space Before: 12pt.

  • Here is another test bullet that should be PictureCaption

Here is a para that should be PictureCaption
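
Because the markup of the HTML samples in this thread did not survive posting, the following Java sketch shows one way such a fragment could be generated. Everything in it is an assumption made for illustration: the helper names (StyledHtml, paragraph, listItem, page), the use of a class attribute to reference a Word style name, and the empty ".Name{}" rules. The thread does not confirm that this exact markup makes both styles apply.

```java
import java.util.List;

/** Builds a small HTML page whose elements reference Word style names through CSS classes. */
public class StyledHtml {

    /** Escapes the characters that are special in HTML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** A paragraph tagged with the given style name (the name itself is not escaped). */
    static String paragraph(String styleName, String text) {
        return "<p class=\"" + styleName + "\">" + escape(text) + "</p>";
    }

    /** A list item tagged with the given style name (the name itself is not escaped). */
    static String listItem(String styleName, String text) {
        return "<li class=\"" + styleName + "\">" + escape(text) + "</li>";
    }

    /** Wraps the body in a page and declares one empty CSS rule per style name. */
    static String page(List<String> styleNames, String body) {
        StringBuilder css = new StringBuilder();
        for (String name : styleNames) {
            css.append('.').append(name).append("{}");
        }
        return "<html><head><style>" + css + "</style></head><body>" + body + "</body></html>";
    }

    public static void main(String[] args) {
        String body = paragraph("Normal", "This paragraph is Normal.")
                + "<ul>"
                + listItem("Normal", "Here is a test bullet")
                + listItem("PictureCaption", "Here is another test bullet")
                + "</ul>"
                + paragraph("PictureCaption", "Here is a para that should be PictureCaption");
        System.out.println(page(List.of("Normal", "PictureCaption"), body));
    }
}
```

The resulting string could then be passed to insertHtml(html, true), or saved and loaded as a separate document for insertDocument, which are the two approaches discussed in this thread.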

I’ve attached an annotated photo of inserting using the three different methods as well as the before and after docs.

If there is a single method that we can use to insert HTML that:

  • contains paragraphs in the *Normal style after inserting
  • contains paragraphs in the PictureCaption style after inserting

that would be helpful. I can’t find a single combination that works.
Edit: Please ignore the insertHTML() method results, as I posted it prior to your response.
Thank you.

OK, thank you. We will watch WORDSNET-12273 for resolution

We will also use the insertDocument method with ImportFormatMode.USE_DESTINATION_STYLES. The behavior seems to mimic the node-wise insertion that I previously used.

But, I am unable to get this method to honor the application of both the .*Normal and PictureCaption styles to different text in the same HTML content.

Hi Chris,

Thanks for your inquiry.

backprop:
“OK, thank you. We will watch WORDSNET-12273 for resolution

We will also use the insertDocument method with ImportFormatMode.USE_DESTINATION_STYLES. The behavior seems to mimic the node-wise insertion that I previously used.”

We will let you know once this issue is resolved.

backprop:
“But, I am unable to get this method to honor the application of both the .*Normal and PictureCaption styles to different text in the same HTML content.”

Style names that start with * are not imported into the Aspose.Words DOM. I logged this problem in our issue tracking system as WORDSNET-12272. We will update you via this forum thread once these issues are resolved.

The issues you have found earlier (filed as WORDSNET-12272) have been fixed in this .NET update and this Java update.

This message was posted using Notification2Forum from Downloads module by aspose.notifier.

Good day.

Is there an implementation scheduled for WORDSNET-12273 resolution?

The issue was introduced in a prior release of Aspose (the function used to work properly). There have now been two releases without resolution.

Thank you.

Hi there,

Thanks for your inquiry. I have checked the status of this issue (WORDSNET-12273) in our issue tracking system, and it has been planned for development. Hopefully, the fix will be available in the Aspose.Words October 2015 release, i.e. 15.10.0.

Please note that this estimate is not final at the moment. We will be sure to inform you via this forum thread as soon as this issue is resolved.

We appreciate your patience.

Hi there,

Thanks for your patience.

Regarding WORDSNET-12273, we suggest using the DocumentBuilder.insertHtml(html, true) method instead of insertDocument to avoid this issue.

Please let us know if you have any more queries.

Hello,
Are you saying that WORDSNET-12273 is resolved in Aspose.Words 15.9? When inserting a table as per the example, it does not appear to work using either insertHtml(html,true) or insertDocument().

Thank you.

Hi there,

Thanks for your inquiry. Please check this forum post. The issue was reproduced using the DocumentBuilder.insertDocument method with the documents you shared. However, this issue (WORDSNET-12273) cannot be reproduced using the DocumentBuilder.insertHtml(…, true) method.

It seems that you are facing this issue with different documents. Could you please share your input, output and expected output document here for testing purposes? Please share the screenshots of problematic sections of output. We will then provide you more information on this.