Write to Content Control after specified body text

SCDGLC · August 29, 2019, 4:31pm

Within a file, I want to populate the first empty content control of a specified tag after some specified text that comes after a bookmark. For example:

ANIMALS
There are many animals in our park.
These include our resident parrot who is called [Name].
And our ever popular miniature pony called [Name].
KEEPERS
We have lots of helpers at our park.
Our head keeper is [Name].
Our resident vet is [Name].

In this example, ANIMALS and KEEPERS are both bookmarks and Name represents a content control tag. How would I add the words ‘John Barnes’ to the head keeper’s Name content control?

I know how to find a bookmark and I know how to populate a specified content control after finding a bookmark, but I don’t know how to search for specific body text and then populate the first content control with the specified tag that’s found after the specified body text.

Note that in this scenario, the specified body text to search for would be ‘head keeper’.

awais.hafeez · August 30, 2019, 4:09am

@SCDGLC,

To ensure a timely and accurate response, please ZIP and attach the following resources here for testing:

Your simplified input Word document
Aspose.Words 19.8 generated output document showing the undesired behavior (if any)
Your expected document showing the correct output. You can create the expected document using MS Word.

As soon as you have these resources ready, we will investigate your scenario further and provide you with code to achieve this using Aspose.Words. Thanks for your cooperation.

SCDGLC · August 30, 2019, 1:09pm

Ok so the example I gave is the entire document, there is nothing else. And I cannot give you the sample code and generated output as the point is that I don’t know how to extract general text. Is it that general text in a Word document is exposed via a Structured Document Tag in Aspose? Because I don’t know how to find specified text otherwise. And if it is, what type is it? And once I find it, I then need to be able to move to that position so that I can populate the next appropriate content control. I know that’s possible for bookmarks but is that also possible for general text? Sorry but I cannot write the code and generate any output without knowing all this first.

Many thanks.

SCDGLC · August 30, 2019, 2:47pm

Here is the sample text converted to a document though…

Sample Doc.zip (14.8 KB)

awais.hafeez · August 31, 2019, 3:14am

@SCDGLC,

Please see these input/output Word documents (Docs.zip (30.0 KB)) and try running the following code:

Document doc = new Document("E:\\Temp\\Sample Doc\\Sample Doc.docx");

Bookmark bm = doc.Range.Bookmarks["KEEPERS"];

Paragraph start = (Paragraph)bm.BookmarkStart.GetAncestor(NodeType.Paragraph);
StructuredDocumentTag targetSdt = null;

Paragraph para = (Paragraph)start;
bool flag = true;
while (para != null && flag)
{
    foreach(StructuredDocumentTag sdt in para.GetChildNodes(NodeType.StructuredDocumentTag, true))
    {
        if (sdt.Tag.Equals("Name"))
        {
            targetSdt = sdt;
            flag = false;
            break;
        }
    }

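    // Advance to the next paragraph, skipping tables and other non-paragraph siblings
    // (a direct cast would throw InvalidCastException on them).
    Node next = para.NextSibling;
    while (next != null && next.NodeType != NodeType.Paragraph)
        next = next.NextSibling;
    para = (Paragraph)next;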
}

if (targetSdt != null)
{
    Run clone = (Run)targetSdt.FirstChild;
    targetSdt.RemoveAllChildren();
    clone.Text = "John Barnes";
    targetSdt.AppendChild(clone);
}

doc.Save("E:\\Temp\\Sample Doc\\19.8.docx");

Hope this helps.
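
For readers adapting this, here is a minimal sketch of the same lookup that walks the node tree in document order instead of stepping through sibling paragraphs, so content controls nested in tables are also reached. The class and method names are hypothetical and not part of Aspose.Words; the sketch assumes `Node.NextPreOrder` and `Range.Bookmarks` behave as documented in the version you reference.

```csharp
using Aspose.Words;
using Aspose.Words.Markup;

// Hypothetical helper, not part of the Aspose.Words API.
static class SdtSearch
{
    // Returns the first content control with the given tag that appears
    // after the start of the named bookmark, or null if there is none.
    public static StructuredDocumentTag FindFirstAfterBookmark(
        Document doc, string bookmarkName, string tag)
    {
        Bookmark bm = doc.Range.Bookmarks[bookmarkName];
        if (bm == null)
            return null; // bookmark not present in this document

        // Visit every node after the bookmark start in document order.
        for (Node node = bm.BookmarkStart; node != null; node = node.NextPreOrder(doc))
        {
            StructuredDocumentTag sdt = node as StructuredDocumentTag;
            if (sdt != null && sdt.Tag == tag)
                return sdt;
        }

        return null;
    }
}
```

A call such as `SdtSearch.FindFirstAfterBookmark(doc, "KEEPERS", "Name")` would then stand in for the `while` loop above.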

SCDGLC · September 1, 2019, 3:31pm

Many thanks. Although I haven’t run the code yet, from reading through it, it seems it would only work for the first empty content control found, which in this case is the head keeper’s. What happens if the head keeper’s name should stay empty and I only want to populate the resident vet’s name, without knowing how many other content controls sit between the bookmark and the one I want to populate? This is why I asked about searching for specific body text (e.g. ‘resident vet’) and populating a content control only after that text has been found and moved to.

awais.hafeez · September 2, 2019, 5:01am

@SCDGLC,

To ensure a timely and accurate response, please ZIP and attach the following resources here for testing:

Your simplified input Word document covering all cases
Aspose.Words 19.8 generated output document showing the undesired behavior
Your expected document showing the correct output. You can create the expected document using MS Word.

As soon as you have these resources ready, we will investigate your scenario further and provide you with code to achieve this using Aspose.Words. Thanks for your cooperation.

SCDGLC · September 2, 2019, 9:04am

@awais.hafeez Everything is already attached as zip files in this conversation?? I don’t know what else you want me to provide.

awais.hafeez · September 3, 2019, 3:31am

@SCDGLC,

You can find any text in a Word document by using the Find and Replace features of Aspose.Words. You can build your logic on the following code to get the desired output:

Document doc = new Document(@"E:\Temp\Sample Doc\Sample Doc.docx");

FindReplaceOptions options = new FindReplaceOptions();
options.ReplacingCallback = new FindAndReplace();

doc.Range.Replace(new Regex("helpers at our"), "", options);

doc.Save(@"E:\Temp\Sample Doc\19.8.docx");

private class FindAndReplace : IReplacingCallback
{
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and may be the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        // This array is used to store all nodes of the match for further removing.
        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        // Paragraph containing the start of the match; build your own logic from this point.
        Paragraph parentPara = ((Run)runs[0]).ParentParagraph;

        // for example replace it with SOMETHING
        DocumentBuilder builder = new DocumentBuilder((Document)e.MatchNode.Document);
        builder.MoveTo((Run)runs[0]);
        builder.Write("SOMETHING");

        foreach (Run run in runs)
            run.Remove();

        return ReplaceAction.Skip;
    }

    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
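
If the goal is only to locate the text rather than replace it, the callback can be much smaller. The following is a minimal sketch under that assumption: `MatchLocator` and the file path are hypothetical, and it relies on `ReplaceAction.Stop` ending the operation without modifying the document.

```csharp
using System;
using System.Text.RegularExpressions;
using Aspose.Words;
using Aspose.Words.Replacing;

// Hypothetical callback: remembers the paragraph of the first match and changes nothing.
class MatchLocator : IReplacingCallback
{
    public Paragraph MatchParagraph { get; private set; }

    public ReplaceAction Replacing(ReplacingArgs e)
    {
        // e.MatchNode is the node in which the match begins.
        MatchParagraph = (Paragraph)e.MatchNode.GetAncestor(NodeType.Paragraph);

        // End the find operation after the first match.
        return ReplaceAction.Stop;
    }
}

static class LocateDemo
{
    static void Main()
    {
        Document doc = new Document("Sample Doc.docx"); // hypothetical path

        MatchLocator locator = new MatchLocator();
        FindReplaceOptions options = new FindReplaceOptions { ReplacingCallback = locator };

        // Regex.Escape treats the search text literally, even if it contains regex metacharacters.
        doc.Range.Replace(new Regex(Regex.Escape("head keeper")), "", options);

        Console.WriteLine(locator.MatchParagraph != null
            ? locator.MatchParagraph.GetText().Trim()
            : "No match found.");
    }
}
```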

SCDGLC · September 4, 2019, 1:04pm

My client’s license is for 2016 and doesn’t allow for that functionality. However they are in the process of upgrading, so as soon as I have the new license I’ll give it a go.

Many thanks.

awais.hafeez · September 4, 2019, 4:17pm

@SCDGLC,

Sure, we will wait for your further input on this topic.

SCDGLC · September 10, 2019, 1:06pm

Ok so apparently I now have the latest license for Aspose.Words; however, FindReplaceOptions and RunExamples are not recognised and the Aspose.Words.Replacing namespace doesn’t exist?

Am I correct in thinking that I can’t have been given the most up to date license, as I know that Replacing requires v19.9?

SCDGLC · September 10, 2019, 3:02pm

PS: please can you expand on what exactly you mean by ‘This is a simplistic method that will only work well when the match starts at the beginning of a run’?

Many thanks

awais.hafeez · September 11, 2019, 3:58am

@SCDGLC,

In addition to acquiring the latest license, you also need to upgrade to the latest version of Aspose.Words for .NET API i.e. 19.9. You can either download Aspose.Words for .NET and reference the DLL in your project or install Aspose.Words for .NET 19.9 via NuGet.

I have removed this comment from my previous code. The code should work for all scenarios. Please let us know if we can be of any further assistance.
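
If the types are still not recognised after upgrading, it may be worth confirming which assembly version the project actually loads and that the namespaces used by the snippets in this thread are imported. A small sketch, assuming `BuildVersionInfo` is available in the referenced build (the class name `VersionCheck` is hypothetical):

```csharp
using System;
using System.Collections;               // ArrayList, used by the callback in this thread
using System.Text.RegularExpressions;   // Regex
using Aspose.Words;                     // Document, Run, Paragraph, BuildVersionInfo
using Aspose.Words.Markup;              // StructuredDocumentTag, MarkupLevel
using Aspose.Words.Replacing;           // FindReplaceOptions, IReplacingCallback, ReplacingArgs

static class VersionCheck
{
    static void Main()
    {
        // Prints the product name and version of the Aspose.Words assembly in use.
        Console.WriteLine(BuildVersionInfo.Product + " " + BuildVersionInfo.Version);
    }
}
```

If the printed version is older than expected, the project is probably still referencing the previous DLL.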

SCDGLC · September 12, 2019, 11:50am

Thanks, I’ve now installed the DLL.

However looking at your code, you don’t do the search and replace under a bookmark. How do I search for text that comes after a bookmark? Basically I need to find the first instance of some specified text that comes after a given bookmark.

awais.hafeez · September 12, 2019, 5:49pm

@SCDGLC,

Please see these input/output documents Docs.zip (30.0 KB) and try running the following code:

Document doc = new Document("E:\\temp\\Sample Doc\\Sample Doc.docx");

FindReplaceOptions options = new FindReplaceOptions();
options.ReplacingCallback = new FindAndReplace();

doc.Range.Replace(new Regex("helpers at our"), "", options);

doc.Save("E:\\Temp\\Sample Doc\\19.9.docx");

private class FindAndReplace : IReplacingCallback
{
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and may be the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        // This array is used to store all nodes of the match for further removing.
        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        // Starting from the paragraph that contains the match, look for the first content control tagged "Name".
        Paragraph start = ((Run)runs[0]).ParentParagraph;
        StructuredDocumentTag targetSdt = null;

        Paragraph para = start;
        bool flag = true;
        while (para != null && flag)
        {
            foreach (StructuredDocumentTag sdt in para.GetChildNodes(NodeType.StructuredDocumentTag, true))
            {
                if (sdt.Tag.Equals("Name"))
                {
                    targetSdt = sdt;
                    flag = false;
                    break;
                }
            }

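            // Advance to the next paragraph, skipping tables and other non-paragraph siblings.
            Node next = para.NextSibling;
            while (next != null && next.NodeType != NodeType.Paragraph)
                next = next.NextSibling;
            para = (Paragraph)next;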
        }

        if (targetSdt != null)
        {
            Run clone = (Run)targetSdt.FirstChild;
            targetSdt.RemoveAllChildren();
            clone.Text = "John Barnes";
            targetSdt.AppendChild(clone);
        }

        foreach (Run run in runs)
            run.Remove();

        return ReplaceAction.Skip;
    }

    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}

Hope this helps.

SCDGLC · September 13, 2019, 5:19pm

That’s helpful thanks, however still a couple of issues.

Firstly, just FYI your code deleted the search text, which I tried to fix in the code below but oddly it didn’t work as expected.

Secondly, your code writes to the content control text prompt rather than entering the text in it as you would do manually (demonstrated by the fact the text is greyed out). Again I tried to fix this but it didn’t work as expected.

I’m not sure if that has anything to do with the fact it looks like my client has given me an evaluation license rather than a full license, though I would assume not?

Thirdly, I have made a few other changes to the code, though it’s not yet working. Please see below…

//
WriteFieldAfter(@"c:\temp\sample doc.docx", "KEEPERS", "resident vet", "success");
//

//
    private static void WriteFieldAfter(string filenameAndPath, string bookmarkName, string searchText, string insertionText)
    {
        Document doc = new Document(filenameAndPath);

        FindReplaceOptions options = new FindReplaceOptions();

        FindAndReplace findAndReplace = new FindAndReplace
        {
            BookmarkName = bookmarkName,
            InsertionText = insertionText
        };

        options.ReplacingCallback = findAndReplace;

        doc.Range.Replace(new Regex(searchText), searchText, options);

        doc.Save(filenameAndPath);
    }

//

//
    private class FindAndReplace : IReplacingCallback
    {
        public object BookmarkName { get; internal set; }
        public string InsertionText { get; internal set; }

        ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
        {
            // This is a Run node that contains either the beginning or the complete match.
            Node currentNode = e.MatchNode;

            // The first (and may be the only) run can contain text before the match,
            // in this case it is necessary to split the run.
            if (e.MatchOffset > 0)
                currentNode = SplitRun((Run)currentNode, e.MatchOffset);

            // This array is used to store all nodes of the match for further removing.
            ArrayList runs = new ArrayList();

            // Find all runs that contain parts of the match string.
            int remainingLength = e.Match.Value.Length;

            while ((remainingLength > 0) &&
                (currentNode != null) &&
                (currentNode.GetText().Length <= remainingLength))
            {
                runs.Add(currentNode);
                remainingLength = remainingLength - currentNode.GetText().Length;

                // Select the next Run node.
                // Have to loop because there could be other nodes such as BookmarkStart etc.
                do
                {
                    currentNode = currentNode.NextSibling;
                }
                while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
            }

            // Split the last run that contains the match if there is any text left.
            if ((currentNode != null) && (remainingLength > 0))
            {
                SplitRun((Run)currentNode, remainingLength);
                runs.Add(currentNode);
            }

            // To do
            Paragraph start = ((Run)runs[0]).ParentParagraph;
            StructuredDocumentTag targetSdt = null;
            Paragraph para = start;
            bool flag = true;

            while (para != null && flag)
            {
                foreach (StructuredDocumentTag sdt in para.GetChildNodes(NodeType.StructuredDocumentTag, true))
                {
                    if (sdt.Tag.Equals(BookmarkName))
                    {
                        targetSdt = sdt;
                        flag = false;
                        break;
                    }
                }
                para = (Paragraph)para.NextSibling;
            }

            if (targetSdt != null)
            {
                Run clone = (Run)targetSdt.FirstChild;
                clone.Font.Name = "Arial";
                clone.Font.Size = 10;
                targetSdt.RemoveAllChildren();
                clone.Text = InsertionText;
                targetSdt.AppendChild(clone);
                
                if (targetSdt.Level == MarkupLevel.Inline)
                {
                    targetSdt.AppendChild(clone);
                }

                if (targetSdt.Level == MarkupLevel.Block)
                {
                    para.ParagraphFormat.Alignment = ParagraphAlignment.Left;
                    para.AppendChild(clone);
                    targetSdt.AppendChild(para);
                }
            }

            foreach (Run run in runs)
                run.Remove();

            return ReplaceAction.Skip;
        }

        private static Run SplitRun(Run run, int position)
        {
            Run afterRun = (Run)run.Clone(true);

            afterRun.Text = run.Text.Substring(position);

            run.Text = run.Text.Substring(0, position);

            run.ParentNode.InsertAfter(afterRun, run);

            return afterRun;
        }
    }

//

Fourthly, I’m attaching a new document to work on and another document showing the expected output (i.e. entering the name Sarah Jones in the appropriate content control).

Sample.zip (56.8 KB)

Many thanks!

awais.hafeez · September 14, 2019, 4:55am

@SCDGLC,

The first two problems should be fixed by the following code:

private class FindAndReplace : IReplacingCallback
{
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and may be the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        // This array is used to store all nodes of the match for further removing.
        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        // Starting from the paragraph that contains the match, look for the first content control tagged "Name".
        Paragraph start = ((Run)runs[0]).ParentParagraph;
        StructuredDocumentTag targetSdt = null;

        Paragraph para = start;
        bool flag = true;
        while (para != null && flag)
        {
            foreach (StructuredDocumentTag sdt in para.GetChildNodes(NodeType.StructuredDocumentTag, true))
            {
                if (sdt.Tag.Equals("Name"))
                {
                    targetSdt = sdt;
                    flag = false;
                    break;
                }
            }

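            // Advance to the next paragraph, skipping tables and other non-paragraph siblings.
            Node next = para.NextSibling;
            while (next != null && next.NodeType != NodeType.Paragraph)
                next = next.NextSibling;
            para = (Paragraph)next;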
        }

        if (targetSdt != null)
        {
            targetSdt.IsShowingPlaceholderText = false;
            Run newTextRun = new Run(start.Document, "John Barnes");
            targetSdt.RemoveAllChildren();
            targetSdt.AppendChild(newTextRun);
        }

        return ReplaceAction.Skip;
    }

    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}

Regarding the fourth problem, the following code produces the expected output (see 19.9.zip (25.7 KB)):

Document doc = new Document("E:\\temp\\sample\\Sample Doc.docx");

FindReplaceOptions options = new FindReplaceOptions();
options.ReplacingCallback = new FindAndReplace();

doc.Range.Replace(new Regex("specialists at our"), "", options);

doc.Save("E:\\Temp\\sample\\19.9.docx");

SCDGLC · September 14, 2019, 1:35pm

Many thanks, however you have failed to produce the output requested (please see the output document attached to my last message).

As you can see from that, I need to be able to identify the bookmark the searched text comes after as there may be duplicate search and insertion texts.

Also I removed the hardcoded search text and insertion text and you put it back in. I need the bookmark name, the searched for text, the content control tag/title and the text to be inserted into the content control to all be generic and not hardcoded as these will vary between documents.

Many thanks.

awais.hafeez · September 15, 2019, 5:22am

@SCDGLC,

Please check the following code. Hope this helps.

WriteFieldAfter("E:\\temp\\sample\\Sample Doc.docx",
                "E:\\Temp\\sample\\19.9.docx",
                "SPECIALISTS",
                "head keepers",
                "success");

private static void WriteFieldAfter(string filenameAndPath, string outputfilenameAndPath, string bookmarkName, string searchText, string insertionText)
{
    Document doc = new Document(filenameAndPath);

    FindReplaceOptions options = new FindReplaceOptions();
    FindAndReplace findAndReplace = new FindAndReplace
    {
        BookmarkName = bookmarkName,
        InsertionText = insertionText
    };
    options.ReplacingCallback = findAndReplace;

    // Escape the search text so characters such as '.' or '(' are matched literally.
    doc.Range.Replace(new Regex(Regex.Escape(searchText)), searchText, options);
    doc.Save(outputfilenameAndPath);
}


private class FindAndReplace : IReplacingCallback
{
    public string BookmarkName { get; internal set; }
    public string InsertionText { get; internal set; }

    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and may be the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        // This array is used to store all nodes of the match for further removing.
        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;

        while ((remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

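        // Only act on matches located after the paragraph that holds the bookmark.
        Document doc = ((Document)e.MatchNode.Document);
        Bookmark bm = doc.Range.Bookmarks[BookmarkName.ToString()];
        // The indexer returns null when the bookmark does not exist, so guard before dereferencing.
        Paragraph bmPara = bm == null ? null : (Paragraph)bm.BookmarkStart.GetAncestor(NodeType.Paragraph);
        if (bmPara != null)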
        {
            // Collect the paragraphs once and compare positions in document order.
            NodeCollection paragraphs = doc.GetChildNodes(NodeType.Paragraph, true);
            int bmParaIndex = paragraphs.IndexOf(bmPara);
            Paragraph start = ((Run)runs[0]).ParentParagraph;
            int searchTextParaIndex = paragraphs.IndexOf(start);

            if (searchTextParaIndex > bmParaIndex)
            {
                StructuredDocumentTag targetSdt = null;

                Paragraph para = start;
                bool flag = true;
                while (para != null && flag)
                {
                    foreach (StructuredDocumentTag sdt in para.GetChildNodes(NodeType.StructuredDocumentTag, true))
                    {
                        if (sdt.Tag.Equals("Name"))
                        {
                            targetSdt = sdt;
                            flag = false;
                            break;
                        }
                    }

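                    // Advance to the next paragraph, skipping tables and other non-paragraph siblings.
                    Node next = para.NextSibling;
                    while (next != null && next.NodeType != NodeType.Paragraph)
                        next = next.NextSibling;
                    para = (Paragraph)next;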
                }

                if (targetSdt != null)
                {
                    targetSdt.IsShowingPlaceholderText = false;
                    Run newTextRun = new Run(start.Document, InsertionText);
                    newTextRun.Font.Name = "Arial";
                    newTextRun.Font.Size = 10;
                    targetSdt.RemoveAllChildren();

                    if (targetSdt.Level == MarkupLevel.Inline)
                    {
                        targetSdt.AppendChild(newTextRun);
                    }

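                    if (targetSdt.Level == MarkupLevel.Block)
                    {
                        // A block-level control holds paragraphs, so wrap the run in a new paragraph
                        // instead of moving an existing document paragraph into the control.
                        Paragraph newPara = new Paragraph(start.Document);
                        newPara.ParagraphFormat.Alignment = ParagraphAlignment.Left;
                        newPara.AppendChild(newTextRun);
                        targetSdt.AppendChild(newPara);
                    }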
                }
            }                    
        }

        return ReplaceAction.Skip;
    }    

    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
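
Two points from the earlier request may still need attention when adapting the code above: the content control tag is hard-coded as "Name", and the callback runs for every match after the bookmark rather than only the first. The following is a rough sketch of one way to address both, not a tested solution. `FillFirstSdtAfterBookmark`, `TagName`, `Filled` and the file paths are hypothetical names, and the sketch assumes that returning `ReplaceAction.Stop` ends the operation without replacing the current match.

```csharp
using System.Text.RegularExpressions;
using Aspose.Words;
using Aspose.Words.Markup;
using Aspose.Words.Replacing;

// Hypothetical variant of the callback above with the tag passed in as a property.
class FillFirstSdtAfterBookmark : IReplacingCallback
{
    public string BookmarkName { get; set; }
    public string TagName { get; set; }
    public string InsertionText { get; set; }
    public bool Filled { get; private set; }

    public ReplaceAction Replacing(ReplacingArgs e)
    {
        Document doc = (Document)e.MatchNode.Document;
        Bookmark bm = doc.Range.Bookmarks[BookmarkName];
        if (bm == null)
            return ReplaceAction.Stop; // no such bookmark, nothing to anchor on

        // Compare paragraph positions in document order, as in the reply above.
        NodeCollection paragraphs = doc.GetChildNodes(NodeType.Paragraph, true);
        int bookmarkIndex = paragraphs.IndexOf(bm.BookmarkStart.GetAncestor(NodeType.Paragraph));
        int matchIndex = paragraphs.IndexOf(e.MatchNode.GetAncestor(NodeType.Paragraph));
        if (matchIndex <= bookmarkIndex)
            return ReplaceAction.Skip; // match is not after the bookmark, keep searching

        // Walk forward from the match to the first control with the requested tag.
        for (Node node = e.MatchNode; node != null; node = node.NextPreOrder(doc))
        {
            StructuredDocumentTag sdt = node as StructuredDocumentTag;
            if (sdt == null || sdt.Tag != TagName)
                continue;

            sdt.IsShowingPlaceholderText = false;
            sdt.RemoveAllChildren();
            Run run = new Run(doc, InsertionText);

            if (sdt.Level == MarkupLevel.Inline)
            {
                sdt.AppendChild(run);
                Filled = true;
            }
            else if (sdt.Level == MarkupLevel.Block)
            {
                // Block-level controls contain paragraphs rather than runs.
                Paragraph paragraph = new Paragraph(doc);
                paragraph.AppendChild(run);
                sdt.AppendChild(paragraph);
                Filled = true;
            }
            break;
        }

        // Only the first match after the bookmark is of interest.
        return ReplaceAction.Stop;
    }
}

static class FillDemo
{
    static void Main()
    {
        Document doc = new Document("Sample Doc.docx"); // hypothetical path
        FillFirstSdtAfterBookmark callback = new FillFirstSdtAfterBookmark
        {
            BookmarkName = "KEEPERS",
            TagName = "Name",
            InsertionText = "John Barnes"
        };
        FindReplaceOptions options = new FindReplaceOptions { ReplacingCallback = callback };
        doc.Range.Replace(new Regex(Regex.Escape("head keeper")), "", options);
        doc.Save("Output.docx"); // hypothetical path
    }
}
```

Checking `callback.Filled` after the call indicates whether a matching content control was actually populated.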