Splitting a run into 'sub' runs

I’m re-engineering an existing merge process, so the setup is not ideal.

The problem I have is that when I’m trying to find the Nodes in the document that represent an iterative block, the tags I’m looking for can be in the middle of a run. The Nodes are then used to duplicate the content between them for the number of iterations specified.

All works well when the start tag “<<&foreach” and the end tag “<<&endfor>>” are in a run on their own (i.e. have whitespace around them), but problems occur when a tag is within a Run alongside other text.

e.g. If the Run contains “Extraneous Text<<&foreach”, what I am after is a Node that simply contains “<<&foreach”.

A similar problem occurs if the end tag is within a run.

e.g. “Spurious Text<<&endfor>>MoreText”

I’ve looked at using the SplitRun code -
private static Run SplitRun(Run run, int position)
{
    Run afterRun = (Run)run.Clone(true);
    afterRun.Text = run.Text.Substring(position);

    run.Text = run.Text.Substring(0, position);
    run.ParentNode.InsertAfter(afterRun, run);

    return afterRun;
}

but as I am looping through the Run collection, it causes issues because it modifies the collection.

Is there a way to get a Run object for a portion of a run?

i.e. if the Run contains “Text<<&Foreach>>Text”, I want to get a Run containing only the “<<&Foreach>>” from within that run.

Failing that, is there a good way of replacing the “<<&foreach” and “<<&endfor>>” within a document with a Bookmark?

I can see how to move a document builder to the Paragraph containing the run, but not how to move the cursor to the start of the relevant tag to insert the bookmark there.

@Etaardvark,

Thanks for your inquiry. To ensure a timely and accurate response, please attach the following resources here for testing:

  • Your input Word document.
  • Please attach the output Word file that shows the undesired behavior.
  • Please create a standalone console application (source code without compilation errors) that helps us to reproduce your problem on our end and attach it here for testing.

As soon as you get these pieces of information ready, we’ll start investigation into your issue and provide you more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

I’ve created a cut-down sample application as requested that shows the problems I’m hitting. It seems to be the same problem manifesting in different places due to the formatting around the tags.

It should be attached, though I can’t see it in the post.

@Etaardvark,

Thanks for your inquiry. We have not found any attachment with your post. Please ZIP and attach it again. Thanks for your cooperation.

OK, I’ve tried attaching the zip with the project three times now, once each with Chrome, Edge, and IE, and every time the file uploads but then fails to attach to the post. I’ve even tried from a machine at home rather than the corporate network, and it still doesn’t work.

Therefore I’ve shared the sample application on GitHub.

@Etaardvark,

Thanks for sharing the details. In the ProcessIterativeMarkup method, you are not using the Range.Replace method. Moreover, you are using an obsolete version of Range.Replace. Please use the latest version, Aspose.Words for .NET 18.8. The following code example finds the shared text and splits the Run nodes correctly.

using System;
using System.Collections;
using Aspose.Words;
using Aspose.Words.Replacing;

Document doc = new Document(MyDir + "WorkingExpansion.docx");
FindReplaceOptions findOptions = new FindReplaceOptions();
findOptions.ReplacingCallback = new FindAndReplaceTag("test");

doc.Range.Replace("<<&foreach", "", findOptions);
doc.Range.Replace("<<&endfor>>", "", findOptions);

doc.Save(MyDir + "output.docx");

public class FindAndReplaceTag : IReplacingCallback
{
    string text;
    public FindAndReplaceTag(string text)
    {
        this.text = text;
    }
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;
        Console.WriteLine(currentNode.GetText());

        // The first (and may be the only) run can contain text before the match, 
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node. 
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }
 
        // Signal to the replace engine to do nothing because we have already done everything we wanted.
        return ReplaceAction.Skip;
    }

    /// <summary>
    /// Splits text of the specified run into two runs.
    /// Inserts the new run just after the specified run.
    /// </summary>
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
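
Once each tag sits in a Run of its own, the bookmark idea raised earlier in the thread could be sketched roughly as below. This is only an illustration under assumptions: the class name TagBookmarker, the method name and the bookmarkPrefix parameter are made up for this example, and it assumes the tag text fills its Run exactly. It iterates over a snapshot taken with ToArray(), so removing runs does not disturb the loop.

```csharp
using System;
using Aspose.Words;

public static class TagBookmarker
{
    // Hypothetical helper: replaces every Run whose text is exactly `tag`
    // with an empty bookmark named bookmarkPrefix + index.
    // Assumes each tag has already been isolated in its own Run.
    public static int ReplaceTagRunsWithBookmarks(Document doc, string tag, string bookmarkPrefix)
    {
        DocumentBuilder builder = new DocumentBuilder(doc);
        int count = 0;

        // ToArray() copies the node references, so the document can be
        // modified inside the loop without affecting the iteration.
        foreach (Node node in doc.GetChildNodes(NodeType.Run, true).ToArray())
        {
            Run run = (Run)node;
            if (run.Text != tag)
                continue;

            string name = bookmarkPrefix + count;

            // Moving to an inline node places the cursor just before it,
            // so the bookmark lands where the tag used to start.
            builder.MoveTo(run);
            builder.StartBookmark(name);
            builder.EndBookmark(name);

            run.Remove();
            count++;
        }

        return count;
    }
}
```

A call such as TagBookmarker.ReplaceTagRunsWithBookmarks(doc, "<<&endfor>>", "endfor_") would then leave bookmarks named endfor_0, endfor_1 and so on in place of the tags.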

We are using 18.1 at the moment, as the Purchase Order for the upgrade is still going through (being delayed by holidays).

Anyway, I’ve actually solved the problem I was having by adding some code to split a run based on the tags within it (i.e. if a run contains a tag with text on either side, it’s split into a number of separate Run objects), so I can ensure the processing is always dealing with a distinct node.

    /// <summary>
    /// Splits a run into separate runs: the text before the start term, the start term itself,
    /// the text before the end term, the end term itself, and any remaining text.
    /// </summary>
    private void SplitElements(Run run, string startTerm, string endTerm)
    {
        int currentPosition = 0;
        if (startTerm != string.Empty)
        {
            currentPosition = run.Text.IndexOf(startTerm);
            // >= 0 so that a term at the very start of the run is still split off.
            if (currentPosition >= 0)
            {
                if (currentPosition > 0)
                {
                    // Text before the start term
                    Run beforeRun = (Run)run.Clone(true);
                    beforeRun.Text = run.Text.Substring(0, currentPosition);
                    run.Text = run.Text.Substring(currentPosition);
                    run.ParentNode.InsertBefore(beforeRun, run);
                }

                // Create a Run for the start term
                Run startTermRun = (Run)run.Clone(true);
                startTermRun.Text = startTerm;
                run.Text = run.Text.Substring(startTerm.Length);
                run.ParentNode.InsertBefore(startTermRun, run);
            }
        }

        if (endTerm != string.Empty)
        {
            // Now get the contents before the end tag
            currentPosition = run.Text.IndexOf(endTerm);
            if (currentPosition >= 0)
            {
                if (currentPosition > 0)
                {
                    Run bodyRun = (Run)run.Clone(true);
                    bodyRun.Text = run.Text.Substring(0, currentPosition);
                    run.Text = run.Text.Substring(currentPosition);
                    run.ParentNode.InsertBefore(bodyRun, run);
                }

                // Create a separate run for the end term when text follows it
                if (run.Text.Length > endTerm.Length)
                {
                    Run endTermRun = (Run)run.Clone(true);
                    endTermRun.Text = run.Text.Substring(0, endTerm.Length);
                    run.Text = run.Text.Substring(endTerm.Length);
                    run.ParentNode.InsertBefore(endTermRun, run);
                }
            }
        }
    }
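
For anyone who wants to check the splitting logic without a document, the same idea can be sketched on plain strings. This is a minimal illustration, not part of the sample project: TagSplitter and SplitAtTags are hypothetical names, and the matching is ordinal and case-sensitive, which is an assumption. Each returned segment corresponds to the text one cloned Run would receive.

```csharp
using System;
using System.Collections.Generic;

public static class TagSplitter
{
    // Hypothetical helper: cuts `text` into segments so that every
    // occurrence of a tag becomes a segment of its own.
    public static List<string> SplitAtTags(string text, params string[] tags)
    {
        var segments = new List<string>();
        int start = 0;

        while (start < text.Length)
        {
            // Find the earliest tag occurrence at or after `start`.
            int bestIndex = -1;
            string bestTag = null;
            foreach (string tag in tags)
            {
                if (string.IsNullOrEmpty(tag))
                    continue;

                int index = text.IndexOf(tag, start, StringComparison.Ordinal);
                if (index >= 0 && (bestIndex < 0 || index < bestIndex))
                {
                    bestIndex = index;
                    bestTag = tag;
                }
            }

            if (bestIndex < 0)
            {
                // No more tags: the rest of the text is one segment.
                segments.Add(text.Substring(start));
                break;
            }

            // Text in front of the tag, if any, becomes its own segment.
            if (bestIndex > start)
                segments.Add(text.Substring(start, bestIndex - start));

            segments.Add(bestTag);
            start = bestIndex + bestTag.Length;
        }

        return segments;
    }
}
```

Called as TagSplitter.SplitAtTags("Spurious Text<<&endfor>>MoreText", "<<&foreach", "<<&endfor>>"), it yields three segments: the leading text, the end tag, and the trailing text. A tag at the very start of the text is handled the same way, with no empty leading segment.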

@Etaardvark,

Thanks for your feedback. It is good to hear that you have found a solution to your query. Please let us know if you have any more queries.