I’m currently trying to find a solution for a problem we’ve encountered within our implementation of Aspose.Words (Version: 13.1.0.0).
The situation is as follows:
A user can create a Word document that serves as a template for a report within our application. In the template, the user can insert variables, which are replaced with the matching content collected from the application (see attachment “Audit report Template.doc” for an example of such a template). When a report is generated from this template, the application collects all user-generated content and inserts it at the designated positions within the document. The content to be inserted can contain HTML code. To insert this type of content, we use the “DocumentBuilder.InsertHtml(strHTMLStringToInsert)” method.
The problem we’re facing is that, once the HTML has been inserted via the DocumentBuilder, the paragraph format of the inserted content doesn’t match the paragraph format defined in the template document. In addition, if an Ordered List (OL) or Unordered List (UL) HTML tag is used in the HTML code, each individual List Item (LI) gets its own Paragraph node, resulting in a list with paragraph spacing after each list item (see attachment “Result.doc” for an example of a report after inserting the HTML code at the designated positions).
To remove the spacing from each individual list item (except for the last item in the list), we would traverse the document and check whether the NextSibling of the current list item is also a list item and part of the same (un)ordered list. If so, we would manually set the “SpaceAfter” property of its ParagraphFormat.
However, since we cannot detect which lists were added via the InsertHtml method, this would update all lists found in the document, including the lists that should remain unaltered.
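For reference, the traversal workaround described above could look roughly like the sketch below. It is only an illustration, not code from our application: the module and method names are hypothetical, and it assumes that the spacing property is exposed as ParagraphFormat.SpaceAfter and that two list items belong to the same list when their ListFormat.List.ListId values match.

```vb
Imports Aspose.Words

' Hypothetical helper module, for illustration only.
Public Module ListSpacingSketch

    ' Sets "space after" to 0 on every list item that is directly followed
    ' by another item of the same list. The last item of each list keeps
    ' its spacing, because its next sibling is not part of that list.
    Public Sub RemoveSpacingBetweenListItems(ByVal objDocument As Document)
        For Each objParagraph As Paragraph In objDocument.GetChildNodes(NodeType.Paragraph, True)
            'only list items are of interest
            If Not objParagraph.IsListItem Then Continue For

            'the next sibling has to be a list item as well
            Dim objNext As Paragraph = TryCast(objParagraph.NextSibling, Paragraph)
            If objNext Is Nothing OrElse Not objNext.IsListItem Then Continue For

            'assumption: an equal ListId means both items are part of the same list
            If objNext.ListFormat.List.ListId = objParagraph.ListFormat.List.ListId Then
                objParagraph.ParagraphFormat.SpaceAfter = 0
            End If
        Next
    End Sub

End Module
```

Because this walks every paragraph in the document, it cannot tell lists that came from the HTML apart from lists that were already in the template, which is exactly the limitation we want to avoid.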
What we are trying to achieve is to insert HTML code while preserving the paragraph format of the original template, without having to traverse the document and manually set the ParagraphFormat for each list item (see attachment “Desired Result.doc”).
What would you suggest as an alternative means for solving this problem?
See Attachment “FormattedMetafieldContents_HTML.txt” for an example of the HTML Code we want to insert into the Report Document by using Aspose.Words. The code we use for inserting the HTML Code is listed below.
Public Function Replacing(ByVal e As Aspose.Words.ReplacingArgs) As Aspose.Words.ReplaceAction Implements Aspose.Words.IReplacingCallback.Replacing
    'create DocumentBuilder object
    Dim objDocumentBuilder As New Aspose.Words.DocumentBuilder(CType(e.MatchNode.Document, Aspose.Words.Document))

    'get the concerning node
    Dim objCurrentNode As Node = e.MatchNode

    'the first (and maybe the only) run can contain text before the match, in this case it is necessary to split the run
    If e.MatchOffset > 0 Then
        objCurrentNode = AH.AsposeHelper.SplitRun(CType(objCurrentNode, Run), e.MatchOffset)
    End If

    'if there is some other text after the current run (match), split that too
    If objCurrentNode.GetText.Length > e.Match.Value.Length Then
        objCurrentNode = AH.AsposeHelper.SplitRun(CType(objCurrentNode, Run), e.Match.Value.Length, objCurrentNode.GetText.Length - e.Match.Value.Length)
    End If

    'the node that contains text should be a Run
    Dim objRun As Run = DirectCast(objCurrentNode, Run)

    'move to the matching node
    objDocumentBuilder.MoveTo(objRun)

    'clear the text of the Run
    objRun.Text = ""

    'insert the value (= HTML)
    objDocumentBuilder.InsertHtml(_strReplacementValue)

    'return, indicating "Skip", because we manually replaced the value
    _blnFound = True
    Return Aspose.Words.ReplaceAction.Skip
End Function
With kind regards,
Tom Pouwelse
Software Engineer
Infoland BV.