How to Remove Hyperlinks from Text in a PDF document?

oren10280 · May 11, 2015, 2:35am

Hi
Is there a way to iterate over the text segments and disable hyperlinks in a PDF document?

Thanks

tilal.ahmad · May 11, 2015, 11:52pm

Hi Oren,

Thanks for your inquiry. You can iterate through the LinkAnnotations and update their properties as needed. Setting an empty URI disables a LinkAnnotation.

Document doc = new Document("input.pdf");

// Get the first link annotation from the first page of the document
LinkAnnotation linkAnnot = (LinkAnnotation)doc.Pages[1].Annotations[1];

if (linkAnnot.Action is GoToURIAction)
{
    // Disable the link by replacing its URI with an empty string
    GoToURIAction goToAction = (GoToURIAction)linkAnnot.Action;
    goToAction.URI = "";

    // Search the text under the annotation
    TextFragmentAbsorber ta = new TextFragmentAbsorber();
    Aspose.Pdf.Rectangle rect = linkAnnot.Rect;
    rect.LLX -= 10;
    rect.LLY -= 10;
    rect.URX += 10;
    rect.URY += 10;
    ta.TextSearchOptions = new TextSearchOptions(rect);
    ta.Visit(doc.Pages[1]);

    // Change color and text.
    foreach (TextFragment tf in ta.TextFragments)
    {
        tf.TextState.ForegroundColor = Aspose.Pdf.Color.Red;
        tf.Text = "Click Here";
    }
}

doc.Save("output.pdf");
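
The snippet above touches only the first annotation on the first page. To disable every URI link in a document, the same idea can be applied in a loop. This is a minimal sketch, assuming the Aspose.Pdf for .NET assembly is referenced; the `LinkCleaner` class, the `DisableUriLinks` method, and the file names are placeholders for illustration, not part of the original answer.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Annotations;

public static class LinkCleaner // hypothetical helper name
{
    // Blanks the URI of every link annotation whose action opens a URI.
    // Returns the number of links that were changed.
    public static int DisableUriLinks(Document doc)
    {
        int changed = 0;
        foreach (Page page in doc.Pages)
        {
            foreach (Annotation annotation in page.Annotations)
            {
                // Skip annotations that are not links or that do not open a URI
                if (annotation is LinkAnnotation link && link.Action is GoToURIAction uriAction)
                {
                    uriAction.URI = "";
                    changed++;
                }
            }
        }
        return changed;
    }

    public static void Main()
    {
        Document doc = new Document("input.pdf"); // placeholder path
        DisableUriLinks(doc);
        doc.Save("output.pdf");                   // placeholder path
    }
}
```

Checking the annotation type with `is` instead of a direct cast avoids an exception when a page also contains annotations that are not links.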

Please contact us for any further assistance.

Best Regards,

oren10280 · May 13, 2015, 6:07am

Hi
Thank you very much for the example.
How can I disable links that are generated from plain text such as
http://edition.cnn.com ? If the text matches a link pattern (e.g. http, www…), Adobe Acrobat turns it into a link automatically.

Thanks

tilal.ahmad · May 14, 2015, 1:04am

Hi Oren,

Thanks for your feedback. I noticed that Adobe automatically links text that matches a URL pattern (http, www). After an initial investigation, we have logged ticket PDFNEWNET-38688 in our issue tracking system for further investigation and resolution. We will notify you as soon as we make progress toward resolving it.

We are sorry for the inconvenience caused.

Best Regards,

camilm · May 20, 2015, 6:37pm

Any update on this issue?

tilal.ahmad · May 21, 2015, 11:19am

Hi Camille,

Thanks for your inquiry. I am afraid the issue is not resolved yet. We logged it only recently, and its investigation is still pending because other issues are already under investigation and resolution ahead of it. We will notify you as soon as we make significant progress toward resolving it.

We are sorry for the inconvenience caused.

Best Regards,

vinhlam · April 4, 2016, 3:30pm

Hello,

Is there any news on this issue?

Thanks,

tilal.ahmad · April 5, 2016, 1:43am

Hi Camille,

Thanks for your inquiry. Please note that we cannot delete hyperlinks from such text, because the document does not contain any actual links. The viewer (Adobe or Foxit) detects text that matches a URL or e-mail format and makes it active automatically. As a workaround, however, we can override the default link with our own LinkAnnotation object that has no target action. Please try the following code snippet; hopefully it will help you accomplish the task.

// load the PDF file
Document doc = new Document(myDir+"Hyperlink.pdf");

// Create a TextFragmentAbsorber that finds all phrases matching
// the regular expression for a hyperlink or an e-mail address
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("(\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|])|((([a-zA-Z]|[0-9])|([-]|[_]|[.]))+[@](([a-zA-Z0-9])|([-])){2,63}[.](([a-zA-Z0-9]){2,63})+$)");

// Enable regular-expression matching for the search
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;

// Run the absorber on a single page (the first one here)
int pageNum = 1;
Aspose.Pdf.Page page = doc.Pages[pageNum];
page.Accept(textFragmentAbsorber);

//get the extracted text fragments
TextFragmentCollection textFragmentCollection =
  textFragmentAbsorber.TextFragments;

// Cover each matched fragment with a borderless link annotation that has no action
foreach (TextFragment textFragment in textFragmentCollection)
{
    LinkAnnotation linkAnnot = new LinkAnnotation(page, textFragment.Rectangle);
    linkAnnot.Border = new Border(linkAnnot);
    linkAnnot.Border.Width = 0;
    page.Annotations.Add(linkAnnot);
}

doc.Save(myDir+"HyperlinkRemoved.pdf");
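
The sample above processes only the page selected by pageNum. A possible variant that applies the same workaround to every page is sketched below. It assumes that `PageCollection.Accept` and the `TextFragment.Page` property are available in the installed Aspose.Pdf for .NET version. The `AutoLinkMasker` class name, the file names, and the shortened URL-only pattern are placeholders for illustration and not part of the original answer.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Annotations;
using Aspose.Pdf.Text;

public static class AutoLinkMasker // hypothetical helper name
{
    public static void Main()
    {
        Document doc = new Document("Hyperlink.pdf"); // placeholder path

        // Shortened pattern (http/https URLs only), taken from the longer
        // expression in the sample above; extend it as needed.
        TextFragmentAbsorber absorber = new TextFragmentAbsorber(
            @"\bhttps?://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]");
        absorber.TextSearchOptions = new TextSearchOptions(true); // regex mode

        // Search all pages instead of a single one
        doc.Pages.Accept(absorber);

        foreach (TextFragment fragment in absorber.TextFragments)
        {
            // Each fragment reports the page it was found on
            Page page = fragment.Page;

            // Cover the matched text with a borderless link that has no action
            LinkAnnotation cover = new LinkAnnotation(page, fragment.Rectangle);
            cover.Border = new Border(cover);
            cover.Border.Width = 0;
            page.Annotations.Add(cover);
        }

        doc.Save("HyperlinkRemoved.pdf"); // placeholder path
    }
}
```

Note that neither this shortened pattern nor the longer one above matches text that starts with a bare "www." prefix, so such text would need an additional alternative in the expression.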

Best Regards,