Move to Paragraph in DOCX, Highlight Word(s) and Insert HTML (Hyperlink, Text, Image, etc.) using C# .NET

Move to a paragraph, then highlight a word/sentence and insert a hyperlink in front of the highlighted word/sentence.

Hi,
We need to do the following:

1 - Open an existing Word document

2 - Navigate to a paragraph

3 - Move to a word/sentence within the paragraph

4 - Highlight the word or words

5 - Insert a hyperlink in front of the highlighted word

Basically, we analyse the documents with a powerful full-text search engine. Its output gives us the following information:
• The index (integer) of the paragraph
• The index (integer) of the word/sentence within the paragraph
• The length (integer) of the word/sentence

For example
var paragraphIndex = 2;
var wordIndex = 23;
var wordLength = 4;

So far I have come up with

        var paragraphIndex = 2;
        var wordIndex = 23;
        var wordLength = 4;

        var wordDocument = new Document("C:\\temp\\test.docx");
        var documentBuilder = new DocumentBuilder(wordDocument);

        // move to the paragraph where the hit is present
        documentBuilder.MoveToParagraph(paragraphIndex, 0);

        // TODO highlight the word ???

        // TODO navigate to the correct location ???

        // insert the anchor tag
        var htmlAnchor = "<a name=\"documenthighlight_nav\" class=\"documentHighlight\"></a>";
        documentBuilder.InsertHtml(htmlAnchor);

I am struggling with highlighting the word and with navigating to the start of the word to insert the hyperlink.

Can you please help?

@david.hancock.imagef,

Thanks for your inquiry. You can move the cursor to the start or end of a paragraph using the DocumentBuilder.MoveToParagraph(int paragraphIndex, int characterIndex) method.

Moving the cursor to a particular character inside a Paragraph is not supported in the latest version of Aspose.Words. Your request has been linked to the appropriate issue (WORDSNET-10148), and you will be notified as soon as moving the cursor to a particular character position in a Paragraph is supported. We apologize for the inconvenience.

However, in your case we suggest the following solution:

  1. Move the cursor to the desired paragraph.
  2. Get the text of the paragraph using Paragraph.ToString(SaveFormat.Text).
  3. Extract the text that you want to highlight from the paragraph's text using the String.Substring method.
  4. Use the Find and Replace feature to find the text and highlight it.
  5. After highlighting the text, move the cursor to the last Run node of the highlighted text and insert the HTML using DocumentBuilder.InsertHtml.
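
The steps above could be sketched roughly as follows. This is a minimal, untested sketch rather than an official sample. It assumes that the search offsets index into the text returned by Paragraph.ToString(SaveFormat.Text), and that Find and Replace leaves each replaced match in a run of its own (the code checks for this instead of relying on it). Note that a plain Find and Replace highlights every occurrence of the text in the paragraph, not only the one at wordIndex. The file paths, the yellow highlight and the class name HighlightHit are placeholders. Because you want the anchor in front of the hit, the sketch moves to the first highlighted run; DocumentBuilder.MoveTo(Node) inserts new content before an inline node.

```csharp
using System;
using System.Drawing;
using Aspose.Words;
using Aspose.Words.Replacing;

class HighlightHit
{
    static void Main()
    {
        // Values as reported by the search engine (example values).
        int paragraphIndex = 2, wordIndex = 23, wordLength = 4;

        Document doc = new Document(@"C:\temp\test.docx");
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Step 1: move the cursor to the paragraph that contains the hit.
        builder.MoveToParagraph(paragraphIndex, 0);
        Paragraph para = builder.CurrentParagraph;

        // Step 2: get the paragraph text.
        string paraText = para.ToString(SaveFormat.Text);
        if (wordIndex < 0 || wordLength <= 0 || wordIndex + wordLength > paraText.Length)
        {
            Console.WriteLine("Hit is outside the paragraph text");
            return;
        }

        // Step 3: extract the text to highlight.
        string hit = paraText.Substring(wordIndex, wordLength);

        // Step 4: replace the text with itself, applying a highlight to the new content.
        FindReplaceOptions options = new FindReplaceOptions();
        options.MatchCase = true;
        options.ApplyFont.HighlightColor = Color.Yellow;
        para.Range.Replace(hit, hit, options);

        // Step 5: find the first highlighted run that holds the hit.
        Run target = null;
        foreach (Run run in para.Runs)
        {
            if (run.Text == hit && run.Font.HighlightColor.ToArgb() == Color.Yellow.ToArgb())
            {
                target = run;
                break;
            }
        }

        if (target != null)
        {
            // Content inserted after MoveTo(run) goes in front of that run.
            builder.MoveTo(target);
            builder.InsertHtml("<a name=\"documenthighlight_nav\" class=\"documentHighlight\"></a>");
        }
        else
        {
            Console.WriteLine("Highlighted run not found");
        }

        doc.Save(@"C:\temp\out.docx");
    }
}
```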

@david.hancock.imagef,

Regarding WORDSNET-10148, you may use the following code as a workaround. Please see the sample input/output Word documents (Docs.zip (17.6 KB)). Hope this helps:

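using System;
using System.Drawing;
using Aspose.Words;

int paragraphIndex = 0;   // zero-based index of the target paragraph
int characterIndex = 4;   // zero-based character position inside the paragraph
string text = "demo";     // text to insert at that position

Document doc = new Document(@"E:\Temp\in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);

// Collect the runs of the target paragraph up front, because splitting adds new runs.
Paragraph targetPara = (Paragraph)builder.CurrentStory.GetChildNodes(NodeType.Paragraph, true)[paragraphIndex];
Node[] runs = targetPara.GetChildNodes(NodeType.Run, true).ToArray();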

// Split every run into single-character runs, so that a run index equals a character index.
for (int i = 0; i < runs.Length; i++)
{
    Run run = (Run)runs[i];
    int length = run.Text.Length;

    Run currentNode = run;
    for (int x = 1; x < length; x++)
    {
        currentNode = SplitRun(currentNode, 1);
    }
}

if (characterIndex >= 0 && characterIndex < targetPara.Runs.Count)
{
    builder.MoveTo(targetPara.Runs[characterIndex]);
    builder.Font.Name = "Verdana";
    builder.Font.Size = 16;
    builder.Font.Color = Color.Red;
    builder.Write(text);
}
else
{
    Console.WriteLine("Incorrect character index specified");
}

// Merge the single-character runs back into runs of identical formatting.
doc.JoinRunsWithSameFormatting();
doc.Save(@"E:\Temp\20.3.docx");

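// Splits a run at the given position: the original run keeps the text before
// the position, and the returned clone holds the text from the position onward.
private static Run SplitRun(Run run, int position)
{
    Run afterRun = (Run)run.Clone(true);
    afterRun.Text = run.Text.Substring(position);
    run.Text = run.Text.Substring(0, position);
    run.ParentNode.InsertAfter(afterRun, run);
    return afterRun;
}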
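
Splitting every run into single-character runs can be slow on long paragraphs. As a possible refinement (a sketch only, not part of the workaround above), the run that contains a given character index can be worked out from the run lengths alone, so that only that one run needs to be split. RunLocator and LocateCharacter are hypothetical names, and the run texts are passed as plain strings to keep the example independent of Aspose.Words.

```csharp
using System;
using System.Collections.Generic;

public static class RunLocator
{
    // Returns the index of the run that contains the character at
    // characterIndex, and the offset of that character inside the run.
    // Returns (-1, -1) when characterIndex is out of range.
    public static (int RunIndex, int Offset) LocateCharacter(
        IReadOnlyList<string> runTexts, int characterIndex)
    {
        if (characterIndex < 0)
        {
            return (-1, -1);
        }

        int consumed = 0; // number of characters in the runs already passed
        for (int i = 0; i < runTexts.Count; i++)
        {
            int length = runTexts[i].Length;
            if (characterIndex < consumed + length)
            {
                return (i, characterIndex - consumed);
            }
            consumed += length;
        }

        return (-1, -1);
    }

    public static void Main()
    {
        var runs = new List<string> { "This ", "is a ", "demo document" };
        var (runIndex, offset) = LocateCharacter(runs, 10);
        Console.WriteLine($"{runIndex}, {offset}"); // prints "2, 0"
    }
}
```

When the offset is greater than 0, a single call to a helper such as SplitRun(run, offset) on the located run yields a run that starts exactly at the requested character. When the offset is 0, moving the builder to the located run is already enough.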

We will also inform you via this thread as soon as WORDSNET-10148 is resolved.

The issues you have found earlier (filed as WORDSNET-10148) have been fixed in the Aspose.Words for .NET 21.2 and Aspose.Words for Java 21.2 updates.