Word Doc: Convert Special Tag to Bookmark and Link to Bookmark

nitin.mistry.bell.ca · November 7, 2012, 3:14pm

I have a Word document with many special tags that represent bookmarks and links to bookmarks. I want to replace these tags with proper bookmarks and links to those bookmarks (see the attached example document, special_tags.docx). Here is what I mean:

Bookmark Tag:
The format of a bookmark tag in my Word document is **[<x>]**
where x is the bookmark name.

for example:
[<b1>] = This represents a bookmark called b1

How can I scan my Word document, find all tags like [<x>], and
replace them with real Word bookmarks?

Link to Bookmark Tag:
The format for an internal link to a bookmark is **[>T~x<]**
where
T = the display text for the link

x = the bookmark name

for example:
[>Go to Special Offers~sp1<]

This tag should be replaced with a link that looks like this: Go to Special Offers

When the user Ctrl+clicks this link, they jump to the sp1 bookmark.

I have lots of these tags, which were generated dynamically, so I do not know
the bookmark names or the display text in advance.

I have attached two documents as an example:
- The original Word document with tags: special_tags.docx
- The same document after the tags have been replaced: tags_converted.docx

awais.hafeez · November 8, 2012, 9:03am

Hi Nitin,

Thanks for your inquiry.

Nitin:

> How can I scan my word document and find all Tags like this [<x>] and replace them with a correct word bookmark.

Here is the code to achieve this.

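// Needs: using System.Collections; using System.Text.RegularExpressions; using Aspose.Words;
// Newer Aspose.Words releases may also need: using Aspose.Words.Replacing; (for IReplacingCallback)
Document doc = new Document(@"C:\Temp\special_tags.docx");
// Replace each [<name>] tag with an empty bookmark of that name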
doc.Range.Replace(new Regex("\\[<[a-z]+\\d>\\]"), new InsertBookmarkAtReplaceHandler(), true);
doc.Save(@"C:\Temp\out.docx");

private class InsertBookmarkAtReplaceHandler : IReplacingCallback
{
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;
        // The first (and may be the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);
        // This array is used to store all nodes of the match for further removing.
        ArrayList runs = new ArrayList();
        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
        (remainingLength > 0) &&
        (currentNode != null) &&
        (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;
            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }
        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }
        string tagName = e.Match.Value.Replace("[<", "").Replace(">]", "");
        DocumentBuilder builder = new DocumentBuilder((Document)e.MatchNode.Document);
        builder.MoveTo(e.MatchNode);
        builder.MoveTo((Run)runs[runs.Count - 1]);
        builder.StartBookmark(tagName);
        builder.EndBookmark(tagName);
        return ReplaceAction.Replace;
    }
    /// <summary>
    /// Splits text of the specified run into two runs.
    /// Inserts the new run just after the specified run.
    /// </summary>
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
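
A note on the pattern used above: `[a-z]+\d` accepts lowercase letters followed by exactly one digit, so a tag such as [<b12>] would be skipped. The following is only a sketch of a more tolerant pattern, assuming bookmark names are lowercase letters followed by one or more digits (an assumption; the sample document only shows names like b1 and sp1). `BookmarkTagParser` and `TryGetName` are hypothetical names for illustration, and the code uses only System.Text.RegularExpressions, not Aspose.Words.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BookmarkTagParser
{
    // Pattern from the Range.Replace call above: letters, then exactly one digit.
    public static readonly Regex Original = new Regex(@"\[<[a-z]+\d>\]");

    // Assumed variant: one or more digits, with the name in a named group
    // so it does not have to be recovered with string.Replace afterwards.
    public static readonly Regex Relaxed = new Regex(@"\[<(?<name>[a-z]+\d+)>\]");

    // Returns the bookmark name of the first tag found in text, or null if there is none.
    public static string TryGetName(string text)
    {
        Match m = Relaxed.Match(text);
        return m.Success ? m.Groups["name"].Value : null;
    }
}
```

Inside the handler, `e.Match.Groups["name"].Value` would then give the bookmark name directly, provided the relaxed pattern is the one passed to `doc.Range.Replace`.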

I am working on the second part of your request and will get back to you soon.

Best Regards,

awais.hafeez · November 22, 2012, 7:16am

Hi Nitin,

Thanks for your patience. For the second part of your problem, I think the following routine will serve the purpose.

private class LinkToBookmarkAtReplaceHandler : IReplacingCallback
{
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;
        // The first (and may be the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);
        // This array is used to store all nodes of the match for further removing.
        ArrayList runs = new ArrayList();
        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;

        while (
        (remainingLength > 0) &&
        (currentNode != null) &&
        (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;
            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }
        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }
        string tagName = e.Match.Value.Replace("[>", "").Replace("<]", "");
        string[] splits = tagName.Split(new char[] { '~' });
        string displayText = splits[0];
        string bookmarkName = splits[1];
        DocumentBuilder builder = new DocumentBuilder((Document)e.MatchNode.Document);
        builder.MoveTo(e.MatchNode);
        builder.MoveTo((Run)runs[runs.Count - 1]);
        builder.InsertHyperlink(displayText, bookmarkName, true);
        return ReplaceAction.Replace;
    }
    /// <summary>
    /// Splits text of the specified run into two runs.
    /// Inserts the new run just after the specified run.
    /// </summary>
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
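
The handler above still needs a pattern that finds the link tags. A minimal sketch of one possible pattern follows, assuming the tags are written without inner spaces, as in the example [>Go to Special Offers~sp1<], and that the display text contains no `~` character. `LinkTagParser` and `TryParse` are hypothetical names for illustration; the code uses only System.Text.RegularExpressions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkTagParser
{
    // "[>" + display text (no '~' or ']') + "~" + bookmark name (no '<' or ']') + "<]"
    public static readonly Regex Pattern =
        new Regex(@"\[>(?<text>[^~\]]+)~(?<name>[^<\]]+)<\]");

    // Extracts the display text and bookmark name from the first link tag in the input.
    // Returns false (and nulls) when no link tag is present.
    public static bool TryParse(string tag, out string displayText, out string bookmarkName)
    {
        Match m = Pattern.Match(tag);
        displayText = m.Success ? m.Groups["text"].Value : null;
        bookmarkName = m.Success ? m.Groups["name"].Value : null;
        return m.Success;
    }
}
```

Such a pattern could be passed in the same way as in the first reply, for example `doc.Range.Replace(LinkTagParser.Pattern, new LinkToBookmarkAtReplaceHandler(), true);`. It is probably safest to run the bookmark replacement first so the link targets already exist.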

Best Regards,

nitin.mistry.bell.ca · November 22, 2012, 7:47am

That’s fantastic! It worked.
Thank you so much!

awais.hafeez · November 23, 2012, 2:28am

Hi Nitin,

Thanks for your feedback. Please let us know any time you have any further queries. We’re always glad to help you.

Best Regards,