Range.Replace(Regex, IReplacingCallback, bool) obsolete

Hi,

Our app is still running Aspose.Words 14.8 (sigh!) and I’m currently updating it to 18.1. Almost all of the code still works (at least after adding the Aspose.Words.Replacing namespace), but I got some warnings about obsolete method calls, and the proposed replacement call gives me unwanted results:

public void Build(Document doc, IDocumentParameter parameter)
{
	var regex = new Regex(@"\bGTM_\S*", RegexOptions.Multiline);
	var bookmarks = new List<string>();
	// Obsolete version
	doc.Range.Replace(regex, new StringListCallback(bookmarks), false);
	// New version
	doc.Range.Replace(regex, string.Empty, new FindReplaceOptions() { ReplacingCallback = new StringListCallback(bookmarks), Direction = FindReplaceDirection.Backward });
}

We use ‘GTM_Something’ in our documents as bookmarks that get replaced at runtime, and the Replace statement gives us the list of bookmarks used in a specific document.
The obsolete version still works and gives the expected result. The new version doesn’t. It seems as if the regex is interpreted differently.
Line breaks in the document no longer end a bookmark. So if there’s a passage in the document like

"
bla bla bla GTM_MyBookmark
more Text bla bla
"
The expected result (which the obsolete version returns) would be:
‘GTM_MyBookmark’

What we get (from new version) is:
‘GTM_MyBookmark[LineBreak]more’
([LineBreak] is the line break from the Word document)

What do I have to do to get this fixed?

@cbiegner,

Thanks for your inquiry. Please ZIP and attach your input and expected output Word documents here for testing. We will investigate the issue on our side and provide you with more information.

Fortuna.myDocs.Tests.zip (14.6 KB)

I removed our license file and you will have to remove/add the reference to Aspose.Words again (we use a local package server for this).

I implemented this as a unit test project which clearly shows the different results. Both test methods finish successfully, but take a look at the bookmarks: they are different although everything except the Range.Replace() call is the same.

@cbiegner,

Thanks for sharing the details. Please use e.MatchNode.ToString(SaveFormat.Text) as shown below to get the desired output.

ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
{
    //var name = e.Match.Value.Trim('.').Trim(',').Trim(';').Trim(':');
    var name = e.MatchNode.ToString(SaveFormat.Text);
    if (!List.Contains(name))
    {
        List.Add(name);
    }

    return ReplaceAction.Skip;
}

This worked well. Have to do some more tests, but thank you so far.

🙂

Unfortunately this is not the ultimate solution. Attached is an original document where most of the bookmarks are found, but at least one (‘GTM_Empf_Kopf’) is not.
As far as I could figure out, the cause is that this bookmark is split across different nodes (‘GTM_’, ‘Empf_’, ‘Kopf’), but I don’t know why this happens.
With the former IReplacingCallback implementation the bookmark was found anyway, since the matching was done on the document, not per node.
Retyping the bookmark once in the document might be a solution, but we have hundreds of documents, some of them with 100 bookmarks or more.
Is there any other way to get this fixed?
GTM_Test_02.zip (18.1 KB)

@cbiegner,

Thanks for your inquiry. Please use the following StringListCallback class to get the desired output.

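using System.Collections;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Replacing;

public class StringListCallback : IReplacingCallback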
{
    /// <summary>
    /// List of found bookmarks
    /// </summary>       
    List<string> List { get; set; }

    /// <summary>
    /// Fill bookmark list
    /// </summary>
    /// <param name="list">List that receives the results</param>
    public StringListCallback(List<string> list)
    {
        List = list;
    }

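    /// <summary>
    /// Callback invoked for every regex match. Collects the matched text
    /// across all runs it spans and adds it to the list if not already present.
    /// </summary>
    /// <param name="e">Details of the current match</param>
    /// <returns>ReplaceAction.Skip, so the document text is not replaced</returns>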
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and maybe the only) run can contain text before the match.
        // In this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node. 
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        string name = "";
        foreach (Run run in runs)
            name += run.Text;

        if (!List.Contains(name))
        {
            List.Add(name);
        }

        // Signal to the replace engine to do nothing because we have already done all that we wanted.
        return ReplaceAction.Skip;
    }

    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
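
To illustrate why a per-node lookup such as e.MatchNode.ToString(SaveFormat.Text) can miss ‘GTM_Empf_Kopf’ while walking the sibling runs finds it, here is a minimal sketch that does not use Aspose.Words at all. It only assumes that the name is stored in three consecutive runs; the run texts and the SplitRunDemo class are made up for illustration, and the actual split in the attached document may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitRunDemo
{
    // Hypothetical stand-in for the text of three consecutive runs in one paragraph.
    public static readonly string[] Runs = { "Dear GTM_", "Empf_", "Kopf and more" };

    // Same pattern as in the original post.
    public static readonly Regex Pattern = new Regex(@"\bGTM_\S*");

    // Looks at each run in isolation, so a name split across runs
    // is only ever seen as a fragment.
    public static List<string> MatchPerRun()
    {
        return Runs.SelectMany(r => Pattern.Matches(r).Cast<Match>())
                   .Select(m => m.Value)
                   .ToList();
    }

    // Looks at the concatenated text, which is roughly what collecting
    // the text of all runs covered by the match achieves.
    public static List<string> MatchJoined()
    {
        return Pattern.Matches(string.Concat(Runs)).Cast<Match>()
                      .Select(m => m.Value)
                      .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", MatchPerRun()));  // GTM_
        Console.WriteLine(string.Join(", ", MatchJoined()));  // GTM_Empf_Kopf
    }
}
```

Note that the callback above splits runs in the document while it collects the names, so if the document is saved afterwards it will contain more runs than before, although the visible text stays the same.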