Inserted hyperlink doesn't work when the document is converted to PDF - Aspose.Words

We are using the code below to insert a hyperlink after every matching paragraph. But once the document is converted to PDF, the hyperlink isn't working.

foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true))
{
    int index;

    if (paragraph.GetAncestor(NodeType.Table) != null)
        continue;

    string paraText = paragraph.ToString(SaveFormat.Text).Trim();

    // Check if the paragraph text matches the section pattern
    if (Regex.IsMatch(paraText, @"^\d+(\.\d{1})+\s*"))
    {
        // Apply the desired style to the paragraph without numbering
        paragraph.ParagraphFormat.StyleName = "Heading 11";

        //get index of current paragraph and insert Back to TOC Hyperlink before the heading
        index = paragraph.Document.GetChildNodes(NodeType.Paragraph, true).IndexOf(paragraph);

        builder.MoveToParagraph(index + 1, -1);

        builder.Font.StyleIdentifier = StyleIdentifier.Hyperlink;
        //builder.InsertField("HYPERLINK \"#TOC\" \\l \"TOC\"", "Back to TOC");

        builder.InsertHyperlink("Back to TOC", tokBkName, true);
    }
}

@hrnallap

To ensure that hyperlinks work correctly after converting a document to PDF using Aspose.Words, you should follow these guidelines:

  1. Insert Hyperlinks Correctly: When inserting hyperlinks, make sure to use the InsertHyperlink method properly. The syntax should look like this:

    builder.InsertHyperlink("Link Text", "http://example.com", false);
    

    The third parameter (isBookmark) indicates whether the second argument is the name of a bookmark inside the document (true) or a URL (false).

  2. Update Fields: After inserting hyperlinks, it is essential to update the fields in the document before saving it as a PDF. You can do this by calling:

    doc.UpdateFields();
    
  3. Save as PDF: When saving the document to PDF, use the following code:

    doc.Save("output.pdf", SaveFormat.Pdf);
    
  4. Check PDF Viewer: Ensure that the PDF viewer you are using supports hyperlinks. Some viewers may not display hyperlinks correctly.

Here is a complete example of how to insert hyperlinks and save the document as a PDF:

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);

// Insert a hyperlink
builder.InsertHyperlink("Back to TOC", "http://example.com", false);

// Update fields
doc.UpdateFields();

// Save the document as PDF
doc.Save("output.pdf", SaveFormat.Pdf);

By following these steps, you should be able to create a PDF document with functional hyperlinks. If you continue to experience issues, please check the specific PDF viewer settings or consider testing with different viewers.
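Since the question is about a link to a bookmark rather than to a URL, it may help to see both kinds side by side. The following is only a minimal sketch: the class name HyperlinkKinds and the bookmark text are made up for illustration, while InsertHyperlink, StartBookmark and EndBookmark are the DocumentBuilder calls already used in this thread.

```csharp
using Aspose.Words;

public static class HyperlinkKinds
{
    // Builds a small document with one internal (bookmark) link and one external (URL) link.
    public static Document Build()
    {
        Document doc = new Document();
        DocumentBuilder builder = new DocumentBuilder(doc);

        // The target of the internal link. The name must match the link's second argument.
        builder.StartBookmark("TOC_Bookmark");
        builder.Writeln("Table of contents");
        builder.EndBookmark("TOC_Bookmark");

        builder.Font.StyleIdentifier = StyleIdentifier.Hyperlink;

        // isBookmark = true: the second argument is a bookmark name.
        builder.InsertHyperlink("Back to TOC", "TOC_Bookmark", true);
        builder.Writeln();

        // isBookmark = false: the second argument is a URL.
        builder.InsertHyperlink("Example site", "http://example.com", false);

        builder.Font.ClearFormatting();
        return doc;
    }
}
```

Calling HyperlinkKinds.Build().Save("output.pdf") should then give a PDF in which the first link jumps within the document and the second opens the URL, provided the bookmark is still present at save time.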

@hrnallap Could you please attach your input and problematic output documents here for testing? We will check the issue and provide you with more information. Unfortunately, screenshots do not give enough information to analyze the problem.

I am using the above code to add a Back to TOC hyperlink, but when the document is converted to PDF it doesn't work.
1273-02-0042-001.pdf (442.0 KB)

@hrnallap Most likely the bookmark is not present in the document. The following code works fine on my side:

Document doc = new Document(@"C:\Temp\in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);

// Insert bookmark at the beginning of TOC.
FieldToc toc = (FieldToc)doc.Range.Fields.Where(f => f.Type == FieldType.FieldTOC).FirstOrDefault();
string tokBkName = "TOC_Bookmark";
if (toc != null)
{
    builder.MoveToField(toc, false);
    builder.StartBookmark(tokBkName);
    builder.EndBookmark(tokBkName);

    // Add "Back to TOC" hyperlinks.
    foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true))
    {
        if (paragraph.GetAncestor(NodeType.Table) != null)
            continue;

        string paraText = paragraph.ToString(SaveFormat.Text).Trim();

        // Check if the paragraph text matches the section pattern
        if (Regex.IsMatch(paraText, @"^\d+(\.\d{1})+\s*"))
        {
            Console.WriteLine("test");
            builder.MoveTo(paragraph);


            builder.Font.StyleIdentifier = StyleIdentifier.Hyperlink;
            builder.InsertHyperlink("Back to TOC", tokBkName, true);
        }
    }
}

doc.Save(@"C:\Temp\out.docx");
doc.Save(@"C:\Temp\out.pdf");

So the steps I am trying are: I use the code to add the bookmark and the Back to TOC hyperlink that takes me to the bookmark. This code saves the document as .docx.

Back to TOC works fine in .docx. Then I open the document and convert it to PDF using Word's Save As option. Once it is converted to PDF, the Back to TOC hyperlink isn't working. Please try this and let me know if you are able to see the hyperlink in PDF format.

If the bookmark isn't present, then it shouldn't work in .docx format either, right? But it's working fine in .docx format. Only in PDF it isn't working.

Does builder.InsertHyperlink work only for DOCX and not PDF? What is the PDF equivalent?

@hrnallap The code provided above works for both DOCX and PDF on my side. When the bookmark is not present in the document, MS Word simply jumps to the beginning of the document. Most likely this is what you are observing.

Please attach your input and output DOCX and PDF documents. We will check them and provide you more information.

Attaching my input file and output file. You can see that Back to TOC doesn't work in the output, which is converted to PDF using the above code. Attaching my code again:

private static void UpdateParagraphStyle(Document doc)
{

    DocumentBuilder builder = new DocumentBuilder(doc);
    // Insert bookmark at the beginning of TOC.
    FieldToc toc = (FieldToc)doc.Range.Fields.Where(f => f.Type == FieldType.FieldTOC).FirstOrDefault();
    string tokBkName = "TOC_Bookmark";

    if (toc != null)
    {
        builder.MoveToField(toc, false);
        builder.StartBookmark(tokBkName);
        builder.EndBookmark(tokBkName);
    }

    foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true))
    {
        int index;

        if (paragraph.GetAncestor(NodeType.Table) != null)
            continue;

        string paraText = paragraph.ToString(SaveFormat.Text).Trim();

        // Check if the paragraph text matches the section pattern
        if (Regex.IsMatch(paraText, @"^\d+(\.\d{1})+\s*"))
        {
            // Apply the desired style to the paragraph without numbering
            paragraph.ParagraphFormat.StyleName = "Heading 11";

            //get index of current paragraph and insert Back to TOC Hyperlink before the heading
            index = paragraph.Document.GetChildNodes(NodeType.Paragraph, true).IndexOf(paragraph);

            builder.MoveToParagraph(index + 1, -1);

            builder.Font.StyleIdentifier = StyleIdentifier.Hyperlink;
            //builder.InsertField("HYPERLINK \"#TOC\" \\l \"TOC\"", "Back to TOC");

            builder.InsertHyperlink("Back to TOC", tokBkName, true);
        }
    }
}

Back to TOC Input.docx (454.2 KB)

Back to TOC Output.pdf (637.9 KB)

@hrnallap The name of the bookmark inserted at the beginning of the document is TOC_Bookmark, but the hyperlink points to a bookmark named TOC. So the output document contains no bookmark that the hyperlink points to. Please modify the code that inserts the hyperlink like this:

builder.InsertHyperlink("Back to TOC", tokBkName, true);
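One way to confirm a mismatch like this is to compare the target of each internal hyperlink with the bookmarks that actually exist in the document. This is a minimal sketch, assuming the links were inserted as HYPERLINK fields with a bookmark target; BookmarkLinkCheck and FindBrokenTargets are hypothetical helper names, not part of Aspose.Words.

```csharp
using System.Collections.Generic;
using System.Linq;
using Aspose.Words;
using Aspose.Words.Fields;

public static class BookmarkLinkCheck
{
    // Returns the bookmark names that internal hyperlinks point to
    // but that do not exist in the document.
    public static List<string> FindBrokenTargets(Document doc)
    {
        return doc.Range.Fields
            .OfType<FieldHyperlink>()
            .Select(link => link.SubAddress)                        // bookmark name for internal links
            .Where(target => !string.IsNullOrEmpty(target))         // skip pure URL links
            .Where(target => doc.Range.Bookmarks[target] == null)   // indexer returns null when missing
            .Distinct()
            .ToList();
    }
}
```

Running this just before doc.Save and printing the result should show whether the name passed to InsertHyperlink and the name passed to StartBookmark really agree at save time.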

Hi, if you look at my code, that line is already there. I am not sure where else you would want me to insert it. Also, if the bookmark is not found, why is it working in .docx and not in PDF?

image.png (43.0 KB)

Do these bookmarks not work in PDF?

@hrnallap Could you please attach the DOCX output document produced on your side? As I have mentioned, the problem is not reproducible on my side. The links work fine in both the DOCX and the PDF produced on my side.

Please find attached the .docx and .pdf that were generated by my code. Bookmarks work in .docx but not in .pdf.
Back to TOC.docx (93.7 KB)

Back to TOC.pdf (199.6 KB)

@hrnallap There is no bookmark in the DOCX; the hyperlink simply jumps to the beginning of the document (the default behavior). So, please make sure the code that inserts the bookmark before the TOC gets executed.

So should I move the builder to the start of the document and then insert the hyperlink as below?

builder.InsertHyperlink("Back to TOC", tokBkName, true);

@hrnallap The hyperlinks are there, but there is no bookmark that these hyperlinks point to. It looks like the following code is not executed, or the inserted bookmark is removed by further processing of the document:

string tokBkName = "TOC_Bookmark";

if (toc != null)
{
    builder.MoveToField(toc, false);
    builder.StartBookmark(tokBkName);
    builder.EndBookmark(tokBkName);
}

I have provided you the .docx and .pdf of the same transformed document. If that is the case, then why do the bookmarks work in .docx and not in PDF? Both formats are saved at the same time after processing.

I feel like these bookmarks that Aspose is adding are meant for .docx only and not for PDF?

@hrnallap The link does not work in either DOCX or PDF because the bookmark does not exist in the documents. When the bookmark does not exist in an MS Word document and you click the hyperlink, MS Word jumps to the beginning of the document. This is the default behavior of MS Word when the bookmark does not exist.

I am using the code below to insert a bookmark. I want to insert the bookmark at the beginning of the document. What changes do I need to make?

DocumentBuilder builder = new DocumentBuilder(doc);
// Insert bookmark at the beginning of TOC.
FieldToc toc = (FieldToc)doc.Range.Fields.Where(f => f.Type == FieldType.FieldTOC).FirstOrDefault();
string tokBkName = "TOC_Bookmark";

if (toc != null)
{
    builder.MoveToField(toc, false);
    builder.StartBookmark(tokBkName);
    builder.EndBookmark(tokBkName);

    builder.InsertHyperlink("Back to TOC", tokBkName, true);

}
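A possible adjustment, offered only as a sketch: if the document has no TOC field, toc is null and the code above inserts no bookmark at all, which would match the behavior described earlier in the thread. The version below falls back to the start of the document in that case. TocBookmark and Ensure are hypothetical names; whether a missing TOC field is really the cause here is an assumption that has not been confirmed.

```csharp
using System.Linq;
using Aspose.Words;
using Aspose.Words.Fields;

public static class TocBookmark
{
    public const string Name = "TOC_Bookmark";

    // Makes sure the bookmark exists: before the first TOC field if there is one,
    // otherwise at the very beginning of the document.
    public static void Ensure(Document doc)
    {
        // Nothing to do if an earlier step already created the bookmark.
        if (doc.Range.Bookmarks[Name] != null)
            return;

        DocumentBuilder builder = new DocumentBuilder(doc);
        Field toc = doc.Range.Fields.FirstOrDefault(f => f.Type == FieldType.FieldTOC);

        if (toc != null)
            builder.MoveToField(toc, false);   // position just before the TOC field
        else
            builder.MoveToDocumentStart();     // fallback when no TOC field is found

        builder.StartBookmark(Name);
        builder.EndBookmark(Name);
    }
}
```

Calling TocBookmark.Ensure(doc) once more right before doc.Save would also cover the case where a later processing step removed the bookmark.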