Free Support Forum - aspose.com

InsertHtml through Replace problem

Hello all

We use the Aspose.Words API and have run into the following issue. We used the "Replace text specified with regular expression with HTML" example to replace specific tags in the attached document (.docx), but it works incorrectly: it inserts the HTML in the wrong place, and in some situations it does not replace the tag occurrence at all. The attached zip contains the template document (FailedReplaceWithHtml.docx) and the replaced document exported to PDF (ClientProposal.pdf). The problem is with the {!Proposal information} tag. Note that the tag occurs twice in the document; the problem is with the last occurrence (see the last page).
Can you tell us what we are doing wrong? Or is this perhaps a bug in the Words component?

Hi

Thanks for your request. Please try using the following method:

using System.Collections;
using System.Text.RegularExpressions;
using Aspose.Words;

// [Test] comes from the unit-test framework in use (e.g. NUnit).
[Test]
public void Test002()
{
    Document doc = new Document(@"Test001\in.doc");
    // The last argument is isForward; false replaces from the end of the range to the beginning.
    doc.Range.Replace(new Regex("test"), new ReplaceEvaluatorFindAndInsertHtml("HTML"), false);
    doc.Save(@"Test001\out.doc");
}

private class ReplaceEvaluatorFindAndInsertHtml : IReplacingCallback
{
    public ReplaceEvaluatorFindAndInsertHtml(string html)
    {
        mHtml = html;
    }

    /// <summary>
    /// This method is called by the Aspose.Words find and replace engine for each match.
    /// This method replaces the match string, even if it spans multiple runs.
    /// </summary>
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and possibly the only) run can contain text before the match;
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        // This list stores all nodes of the match for later removal.
        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        // Create a DocumentBuilder, move it to the last run of the match and insert the HTML there.
        DocumentBuilder builder = new DocumentBuilder(e.MatchNode.Document as Document);
        builder.MoveTo((Run)runs[runs.Count - 1]);
        builder.InsertHtml(mHtml);

        // Now remove all runs in the sequence.
        foreach (Run run in runs)
            run.Remove();

        // Signal to the replace engine to do nothing because we have already done everything we wanted.
        return ReplaceAction.Skip;
    }

    /// <summary>
    /// Splits text of the specified run into two runs.
    /// Inserts the new run just after the specified run.
    /// </summary>
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }

    private string mHtml;
}

Hope this helps. Please let us know if you need more information. We will be glad to help you.

Best regards,

Works well with Range.Replace(regex, callback, false) but doesn’t work with Range.Replace(regex, callback, true). I don’t understand why but it doesn’t matter.

Anyway thank you for your help.

Hi

Thank you for the additional information. In your case, you perform node manipulations in your IReplacingCallback. When you replace from the beginning of the range to the end (isForward=true), the node indexes may change while the replacement is still in progress, and that is why the problem occurs. So if you need to manipulate nodes in your IReplacingCallback, I would suggest replacing from the end of the range to the beginning (isForward=false).
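
To illustrate the general effect, here is a small sketch in plain C#. Please note that it is only an analogy: it uses an ordinary List<string> instead of document nodes, it does not show how Aspose.Words works internally, and the class and method names (ReplaceDirectionDemo, ReplaceAt) are made up for this example. The match positions are recorded before any edits, and each match is replaced by two items, so every edit shifts the items that follow it.

```csharp
using System;
using System.Collections.Generic;

public static class ReplaceDirectionDemo
{
    // Replaces the item at each recorded position with two new items.
    // The positions were computed before any edits, so they can go stale
    // as soon as an earlier edit changes the length of the list.
    public static List<string> ReplaceAt(List<string> items, IEnumerable<int> positions)
    {
        var result = new List<string>(items);
        foreach (int i in positions)
        {
            result.RemoveAt(i);
            result.InsertRange(i, new[] { "<b>", "html" });
        }
        return result;
    }

    public static void Main()
    {
        // "TAG" occurs at positions 1 and 3.
        var items = new List<string> { "a", "TAG", "b", "TAG", "c" };

        // Forward: the edit at position 1 shifts the second "TAG" to position 4,
        // so the edit at position 3 hits "b" and the second "TAG" is left behind.
        List<string> forward = ReplaceAt(items, new[] { 1, 3 });
        Console.WriteLine(string.Join(" ", forward));   // a <b> html <b> html TAG c

        // Backward: the edit at position 3 only shifts items after it,
        // so position 1 is still valid when it is processed.
        List<string> backward = ReplaceAt(items, new[] { 3, 1 });
        Console.WriteLine(string.Join(" ", backward));  // a <b> html b <b> html c
    }
}
```

The forward pass shows both symptoms from the original report (content inserted in the wrong place and a tag left unreplaced), while the backward pass replaces both tags correctly, because edits made near the end never move anything that is still waiting to be processed.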

Best regards,