Range.Replace merges Runs

VincentVanHulst · May 20, 2011, 10:37am

Hi,

Currently I am working on a project where I do a database merge with several parameters in a Word document. An example of a replacement is:

{@117_customername}This is demotext{@}

The rule is simple:

IF {@117_customername} is empty, remove the whole line (usually a paragraph), including the text up until the next {@}.
IF {@117_customername} DOES exist, keep the line BUT remove the tags {@117_customername} and {@} so that only ‘This is demotext’ remains.

For this I use Paragraph.Range.Replace(currentValue, replaceValue, false, false), where currentValue is the whole text of the paragraph and replaceValue is the return value of a function that does the lookup and replace on the whole string.
Now, everything works fine, except for the font formatting. If in the line ‘This is demotext’ the word ‘This’ is colored RED and ‘is demotext’ is colored BLACK, the whole line becomes RED after the replace. That is because Range.Replace merges the runs into one run and uses the font of the first run.
I hope the above description / approach is clear. How can I make sure that the formatting is kept in place / necessary runs in the paragraph remain in place?
Thanx!
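
For readers following along, the formatting loss can be avoided in principle by editing each run's text separately instead of replacing the paragraph text as one string. The snippet below is only a minimal sketch of that idea using plain .NET types. `SimpleRun`, `TagStripper` and `StripTags` are hypothetical names made up for illustration, not Aspose.Words API, and the sketch assumes that a tag never straddles two runs.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

' Hypothetical stand-in for a formatted run; this is NOT an Aspose.Words type.
Public Class SimpleRun
    Public Property Text As String
    Public Property Color As String

    Public Sub New(ByVal text As String, ByVal color As String)
        Me.Text = text
        Me.Color = color
    End Sub
End Class

Public Module TagStripper
    ' Matches an opening tag such as {@117_customername} as well as the closing tag {@}.
    Private ReadOnly TagPattern As New Regex("\{@[^}]*\}")

    ' Removes the tags from every run in place, so each run keeps its own color.
    ' Assumption: a tag is always fully contained in a single run.
    Public Sub StripTags(ByVal runs As List(Of SimpleRun))
        For Each r As SimpleRun In runs
            r.Text = TagPattern.Replace(r.Text, String.Empty)
        Next
    End Sub

    Public Sub Main()
        Dim runs As New List(Of SimpleRun) From {
            New SimpleRun("{@117_customername}This", "Red"),
            New SimpleRun(" is demotext{@}", "Black")
        }
        StripTags(runs)
        For Each r As SimpleRun In runs
            Console.WriteLine("[{0}] '{1}'", r.Color, r.Text)
        Next
        ' Prints:
        ' [Red] 'This'
        ' [Black] ' is demotext'
    End Sub
End Module
```

Because the text of each run is changed where it lives, the run boundaries (and with them the colors) survive. A tag that is split across several runs would need the run-splitting approach discussed later in this thread.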

adam.skelton · May 20, 2011, 8:37pm

Hi Vincent,
Thanks for your inquiry.
Your explanation makes sense but could you please attach your code here for testing as well? This would make checking things a lot easier.
Thanks,

VincentVanHulst · May 23, 2011, 4:21am

Hi Adam,

Thanks for the reply. Let’s skip this one, because it’s sort of common sense that if you do a Range.Replace and pass in the whole string of the range, it’s not possible to keep the runs. (If a paragraph has 2 runs with text and I replace the whole paragraph with a string, there is no way that Aspose can re-split that into runs, as it doesn’t know which part of the text belongs to which run.)
BUT, I fixed this by using an IReplacingCallback.Replacing on the paragraph for each part that has to be replaced. It looks promising, but if I use e.MatchNode, it seems to start from the back of the paragraph and work its way to the start. The result is that if I have a paragraph with:

{@117}part 1{@}{@124}part 2{@}
And I search for “{@}” with e.MatchNode, it returns the last run (belonging to 124) instead of the first run with the "{@}" (belonging to 117).
How can that be changed?

VincentVanHulst · May 23, 2011, 4:29am

The code used is as follows:

Private Class ParagraphReplaceHelper
    Public Sub New(ByVal docParagraph As Paragraph)
        mParagraph = docParagraph
    End Sub

    Public Sub Replace(ByVal oldText As String, ByVal newText As String)
        mParagraph.Range.Replace(New Regex(Regex.Escape(oldText)), New ReplaceEvaluatorFindAndInsertText(newText), False)
    End Sub

    Private Class ReplaceEvaluatorFindAndInsertText
        Implements IReplacingCallback

        Public Sub New(ByVal text As String)
            mText = text
        End Sub

        Private Function IReplacingCallback_Replacing(ByVal e As ReplacingArgs) As ReplaceAction Implements IReplacingCallback.Replacing
            ' This is a Run node that contains either the beginning or the complete match.
            Dim currentNode As Node = e.MatchNode

            ' The first (and maybe the only) run can contain text before the match;
            ' in this case it is necessary to split the run.
            If e.MatchOffset > 0 Then
                currentNode = SplitRun(DirectCast(currentNode, Run), e.MatchOffset)
            End If

            ' This array is used to store all nodes of the match for later removal.
            Dim runs As New ArrayList()

            ' Find all runs that contain parts of the match string.
            Dim remainingLength As Integer = e.Match.Value.Length
            While (remainingLength > 0) AndAlso (currentNode IsNot Nothing) AndAlso (currentNode.GetText().Length <= remainingLength)
                runs.Add(currentNode)
                remainingLength = remainingLength - currentNode.GetText().Length

                ' Select the next Run node.
                ' Have to loop because there could be other nodes such as BookmarkStart etc.
                Do
                    currentNode = currentNode.NextSibling
                Loop While (currentNode IsNot Nothing) AndAlso (currentNode.NodeType <> NodeType.Run)
            End While

            ' Split the last run that contains the match if there is any text left.
            If (currentNode IsNot Nothing) AndAlso (remainingLength > 0) Then
                SplitRun(DirectCast(currentNode, Run), remainingLength)
                runs.Add(currentNode)
            End If

            ' Do necessary stuff here. But the array contains the wrong run.

            Return ReplaceAction.Stop
        End Function

        ' Splits the run at the given position and returns the new run holding the text after it.
        Private Shared Function SplitRun(ByVal run As Run, ByVal position As Integer) As Run
            Dim afterRun As Run = DirectCast(run.Clone(True), Run)
            afterRun.Text = run.Text.Substring(position)
            run.Text = run.Text.Substring(0, position)
            run.ParentNode.InsertAfter(afterRun, run)

            Return afterRun
        End Function

        Private mText As String
    End Class

    Private mParagraph As Paragraph
End Class

VincentVanHulst · May 23, 2011, 4:33am

Oh yes, and I call this replacement function via the code below. The important thing is that I only want to replace the FIRST occurrence of {@} in the paragraph with "". Other occurrences will be handled later on by my own functions. If all of the {@} matches in the paragraph are replaced at once, the other replace rules won’t work correctly.

Dim parReplacer As New ParagraphReplaceHelper(docParagraph)

parReplacer.Replace("{@}", "")
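
To make the "first occurrence only" requirement concrete, here is a small sketch on a plain string, using the count argument of .NET's `Regex.Replace`. It only illustrates the intended result and works on text, not on document runs. `FirstOccurrenceDemo` and `ReplaceFirst` are hypothetical names introduced for this example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module FirstOccurrenceDemo
    ' Replaces only the first (leftmost) occurrence of a literal tag in the text.
    Public Function ReplaceFirst(ByVal text As String, ByVal tag As String, ByVal replacement As String) As String
        Dim pattern As New Regex(Regex.Escape(tag))
        ' "$" is special in regex replacement strings, so it is doubled to keep it literal.
        Dim literalReplacement As String = replacement.Replace("$", "$$")
        ' The count argument (1) limits the replacement to the first match.
        Return pattern.Replace(text, literalReplacement, 1)
    End Function

    Public Sub Main()
        Dim text As String = "{@117}part 1{@}{@124}part 2{@}"
        Console.WriteLine(ReplaceFirst(text, "{@}", ""))
        ' Prints: {@117}part 1{@124}part 2{@}
    End Sub
End Module
```

Only the {@} that closes the 117 block is removed; the one that closes the 124 block stays in place for the later replace rules.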

alexey.noskov · May 23, 2011, 2:25pm

Hi
Thanks for your request. MatchNode returns the last run because you are searching for occurrences from the end to the beginning of the document:

mParagraph.Range.Replace(New Regex(Regex.Escape(oldText)), New ReplaceEvaluatorFindAndInsertText(newText), False)

If you change this parameter to True, Aspose.Words will search from the beginning to the end. But if you need to perform node manipulations in your ReplacingCallback, I would suggest that you keep searching from the end to the beginning.
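
As an analogy only (this is plain .NET, not Aspose.Words internals), the effect of the search direction on which match is reported first can be seen with `RegexOptions.RightToLeft`. `SearchDirectionDemo` and `FirstReportedIndex` are hypothetical names used for this sketch.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module SearchDirectionDemo
    ' Returns the start index of the first match reported for the given direction,
    ' or -1 if the tag does not occur in the text.
    Public Function FirstReportedIndex(ByVal text As String, ByVal tag As String, ByVal forward As Boolean) As Integer
        Dim options As RegexOptions = If(forward, RegexOptions.None, RegexOptions.RightToLeft)
        Dim m As Match = Regex.Match(text, Regex.Escape(tag), options)
        Return If(m.Success, m.Index, -1)
    End Function

    Public Sub Main()
        Dim text As String = "{@117}part 1{@}{@124}part 2{@}"
        Console.WriteLine(FirstReportedIndex(text, "{@}", True))   ' 12 (the tag closing 117)
        Console.WriteLine(FirstReportedIndex(text, "{@}", False))  ' 27 (the tag closing 124)
    End Sub
End Module
```

A backward search reports the {@} that belongs to 124 first, which matches what was observed with e.MatchNode when the last argument is False.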
Also, as far as I can see, all you need is to fill the template with data from your data source. If so, I suppose Mail Merge would be a better option for you than replacing placeholders:
https://docs.aspose.com/words/net/types-of-mail-merge-operations/
When using Mail Merge, you can use the MERGEFORMAT switch with a merge field; in this case, the formatting of the merge field is inherited by the value.
Best regards,

awais.hafeez · November 3, 2018, 3:25am

A post was split to a new topic: Consolidating Runs during Find and Replace