Hi,
Jumping straight to the point:
- I am converting a Word document into HTML.
- I want to traverse the converted HTML document, find a certain word (a predefined keyword), and wrap that word in a link (an anchor tag).
For example, the contents of the Word document:
“The R&D program is largely funded by producer levies, with matched funding from the Federal Government.
Levies are also collected by the processing, lotfeeding and live export sectors, for investment in projects that support the red meat supply chain beyond the farm gate.”
I want to find the word (“funded” here) and turn it into a link, e.g. …funded.
I don’t know how to do that, or whether it is even doable. Also, is there any way to get well-structured HTML tags after the document has been converted from Word? The HTML file currently looks pretty bad, with far too many spans and divs.
Thanks in advance for the help.
Hi Bikram,
Thanks for your inquiry. You can achieve this by implementing the IReplacingCallback interface. Please use the following code example to find the text and insert a hyperlink. We also suggest reading the following documentation link. Hope this helps.
Find and Replace
// Namespaces used by this example. In newer Aspose.Words releases,
// IReplacingCallback and ReplacingArgs live in Aspose.Words.Replacing.
using System.Collections;
using System.Drawing;
using System.Text.RegularExpressions;
using Aspose.Words;

Document doc = new Document(MyDir + "in.docx");
// The last argument (isForward) is false, so matches are processed from the end of the range backwards.
doc.Range.Replace(new Regex("funded"), new FindandInsertHyperlink("https://www.google.com/", "funded"), false);
doc.Save(MyDir + "Out.docx");

public class FindandInsertHyperlink : IReplacingCallback
{
    private string hyperlink;
    private string displayText;
    DocumentBuilder builder;

    public FindandInsertHyperlink(string hyperlink, string displayText)
    {
        this.hyperlink = hyperlink;
        this.displayText = displayText;
    }

    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning of the match or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and possibly the only) run can contain text before the match;
        // in that case the run has to be split.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        // Stores all nodes of the match so they can be removed later.
        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node.
            // A loop is needed because there could be other nodes in between, such as BookmarkStart.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        if (builder == null)
            builder = new DocumentBuilder(e.MatchNode.Document as Document);

        // Position the builder at the first run of the match.
        builder.MoveTo((Run)runs[0]);

        // Specify font formatting for the hyperlink.
        builder.Font.Color = Color.Blue;
        builder.Font.Underline = Underline.Single;

        // Insert the link.
        builder.InsertHyperlink(displayText, hyperlink, false);

        // Revert to default formatting.
        builder.Font.ClearFormatting();

        // Remove the original runs that held the matched text.
        foreach (Run run in runs)
        {
            run.Remove();
        }

        // Tell the replace engine to do nothing, because everything has already been done here.
        return ReplaceAction.Skip;
    }

    // Splits the run at the given position and returns the new run that holds the text after it.
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
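The part of the callback that is easiest to get wrong is isolating the match when Word has stored the text in several runs. The following is only a minimal sketch of that splitting logic on plain strings, with no Aspose.Words dependency, so it can be run on its own. The names RunSplitSketch and IsolateMatch and the sample run texts are hypothetical and chosen for illustration; they are not part of the Aspose API.

```csharp
using System;
using System.Collections.Generic;

public static class RunSplitSketch
{
    // Splits runs[index] at position; the tail becomes a new entry at index + 1.
    private static void SplitRun(List<string> runs, int index, int position)
    {
        string text = runs[index];
        runs[index] = text.Substring(0, position);
        runs.Insert(index + 1, text.Substring(position));
    }

    // Splits runs in place so that the match is covered by whole runs only,
    // and returns the index of the first and last run of the match.
    public static (int First, int Last) IsolateMatch(
        List<string> runs, int runIndex, int matchOffset, int matchLength)
    {
        int current = runIndex;

        // Text before the match in the first run: split it off.
        if (matchOffset > 0)
        {
            SplitRun(runs, current, matchOffset);
            current++;
        }

        int first = current;
        int remaining = matchLength;

        // Consume every run that lies completely inside the match.
        while (remaining > 0 && current < runs.Count && runs[current].Length <= remaining)
        {
            remaining -= runs[current].Length;
            current++;
        }

        // Text after the match in the last run: split it off as well.
        if (remaining > 0 && current < runs.Count)
        {
            SplitRun(runs, current, remaining);
            current++;
        }

        return (first, current - 1);
    }

    public static void Main()
    {
        // "funded" starts at offset 8 of the first run and continues into the second.
        var runs = new List<string> { "largely fun", "ded by producer levies" };
        var (first, last) = IsolateMatch(runs, 0, 8, 6);

        Console.WriteLine(string.Join("|", runs)); // largely |fun|ded| by producer levies
        Console.WriteLine($"{first}..{last}");     // 1..2
    }
}
```

In the real callback, the runs in that index range are the ones collected for removal, and the hyperlink is inserted in front of the first of them.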
Hi,
Thank you for the reply; that really helped a lot. I have one more question, however: can I use both Aspose.Words and Aspose.Pdf in the same project? What’s happening is that I can use Aspose.Words.Document but not Aspose.Pdf.Document. The Document option doesn’t appear after Aspose.Pdf.
Thanks.
Hi Bikram,
Thanks for your inquiry. You can use Aspose.Pdf and Aspose.Words in the same project. Could you please share more detail about the problem, along with a sample code example? We will then provide more information along with code.
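In the meantime, here is a minimal sketch of one way to use both Document classes side by side. It assumes that both the Aspose.Words and Aspose.Pdf assemblies are referenced by the project, which has not been confirmed for your case. The aliases WordsDocument and PdfDocument, the class name BothLibrariesSketch, and the file names are hypothetical. Because both libraries define a class called Document, using aliases is one way to avoid an ambiguous reference.

```csharp
// Aliases so that the two Document classes cannot be confused (alias names are illustrative).
using WordsDocument = Aspose.Words.Document;
using PdfDocument = Aspose.Pdf.Document;

public static class BothLibrariesSketch
{
    public static void Main()
    {
        // Load a Word document with Aspose.Words and save it as PDF
        // (the output format is taken from the file extension).
        WordsDocument wordsDoc = new WordsDocument("in.docx");
        wordsDoc.Save("out.pdf");

        // Open the resulting PDF with Aspose.Pdf.
        PdfDocument pdfDoc = new PdfDocument("out.pdf");
        System.Console.WriteLine(pdfDoc.Pages.Count);
    }
}
```

If Document still does not appear after Aspose.Pdf with code like this, it is worth checking that the Aspose.Pdf reference was actually added to the project.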