Replace Text in a Table Cell that contains Line Breaks

We’re evaluating the latest version of Aspose.Words for .NET.

We have a Word document that contains a table. For a particular cell in this table, we would like to run a search and replace where the text being searched can contain line breaks such as \r. The Range.Replace function does not allow special characters like \r, so we tried the Replace method that takes a Regex and a ReplaceEvaluator, which partially works. The issue is that line breaks such as \r are left behind in the text.

I have attached a sample doc. The doc contains one table with 3 columns. Columns 2 and 3 contain line breaks. Let’s say I want to do a search and replace on column 2 for “[HRA] [HSA]\r” and replace it with “HRA”. How would I go about doing that?

Right now the result of our Replace execution is “HSA\r” instead of “HSA”.

Sample.zip (11.4 KB)

@v6er,

You can replace a paragraph break by using the following code:

using System.Collections;
using Aspose.Words;
using Aspose.Words.Replacing;

Document doc = new Document("D:\\temp\\Sample\\Sample.docx");

FindReplaceOptions findReplaceOptions = new FindReplaceOptions(FindReplaceDirection.Backward);
findReplaceOptions.ReplacingCallback = new Replacer();

doc.Range.Replace("[HRA] [HSA]", "HRA ", findReplaceOptions);

doc.Save("D:\\Temp\\Sample\\18.6.docx");
// Callback that inserts the replacement text and merges in the following paragraph.
public class Replacer : IReplacingCallback
{
    public Replacer()
    {

    }

    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);

        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }

    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and possibly the only) run can contain text before the match;
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        // This list stores all runs of the match so they can be removed later.
        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while ((remainingLength > 0) &&
                (currentNode != null) &&
                (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;

            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        DocumentBuilder builder = new DocumentBuilder((Document)e.MatchNode.Document);
        builder.MoveTo((Run)runs[runs.Count - 1]);
        builder.Write(e.Replacement);

        Run lastRun = (Run)runs[runs.Count - 1];
        Paragraph parentPara = lastRun.ParentParagraph;
        if (lastRun == parentPara.Runs[parentPara.Runs.Count - 1])
        {
            if (parentPara.NextSibling != null && parentPara.NextSibling.NodeType == NodeType.Paragraph)
            {
                Paragraph nextPara = (Paragraph)parentPara.NextSibling;
                // Copy only the immediate children; a deep traversal would also re-append nested descendants.
                foreach (Node node in nextPara.GetChildNodes(NodeType.Any, false))
                {
                    parentPara.AppendChild(node.Clone(true));
                }
                nextPara.Remove();
            }
        }

        // Now remove all runs in the sequence.
        foreach (Run run in runs)
        {
            run.Remove();
        }

        return ReplaceAction.Skip;
    }
}

Thanks for the sample code; I’ll try it out, although I’m not sure it will meet our needs. The requirement is to be able to search for text that could contain things like line breaks, tabs, etc., and replace it.

I did try the following but received an error:

Aspose.Words.Replacing.FindReplaceOptions opt = new Aspose.Words.Replacing.FindReplaceOptions();
opt.FindWholeWordsOnly = false;
opt.MatchCase = false;
opt.Direction = FindReplaceDirection.Backward;

string textToFind = @"[HRA] [HSA]\rNetwork";
string valueToSetTo = "";
var cnt = cell.Range.Replace(textToFind.Replace(ControlChar.Cr, "&p"), valueToSetTo, opt);

The error I received was “Cannot add a node Before/After itself.”

@v6er,

To replace a paragraph break, please use the code from my previous post. For general tab and line break characters, the following code will work (see the input/output documents in Docs.zip (18.1 KB)).

Document doc = new Document("D:\\temp\\Sample.docx");
FindReplaceOptions findReplaceOptions = new FindReplaceOptions(FindReplaceDirection.Backward);
doc.Range.Replace("hello " + ControlChar.Tab + " World " + "Great " + ControlChar.LineBreak + " New Line", 
                    "REPLACED", 
                    findReplaceOptions);
doc.Save("D:\\Temp\\18.6.docx");
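
A side note on the snippet you tried: in C#, a verbatim string (prefixed with @) does not process escape sequences, so @"[HRA] [HSA]\rNetwork" holds a backslash followed by the letter r rather than a carriage return, and the call to Replace(ControlChar.Cr, "&p") finds nothing to substitute. The small sketch below does not use Aspose.Words at all and only illustrates that difference. PatternHelper and ToPattern are hypothetical names made up for this illustration, and whether your Aspose.Words version interprets &p inside a search pattern is an assumption you would need to verify separately.

```csharp
using System;

public static class PatternHelper
{
    // Hypothetical helper: swaps each carriage return ("\r", the same
    // character as ControlChar.Cr) for the "&p" token.
    public static string ToPattern(string text)
    {
        return text.Replace("\r", "&p");
    }
}

public static class Program
{
    public static void Main()
    {
        // Verbatim literal: "\r" stays as two characters, '\' and 'r'.
        string verbatim = @"[HRA] [HSA]\rNetwork";
        // Regular literal: "\r" becomes a single carriage return character.
        string regular = "[HRA] [HSA]\rNetwork";

        Console.WriteLine(verbatim.Contains("\r"));                        // False
        Console.WriteLine(regular.Contains("\r"));                         // True
        Console.WriteLine(PatternHelper.ToPattern(verbatim) == verbatim);  // True
        Console.WriteLine(PatternHelper.ToPattern(regular));               // [HRA] [HSA]&pNetwork
    }
}
```

If the search text is meant to include a real break character, building it from a regular literal (or by concatenating ControlChar constants, as in the code above) avoids this pitfall.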

Please also refer to the following article, which explains the Find and Replace feature of Aspose.Words:

Find and Replace

You can also write your own logic by implementing the IReplacingCallback interface.