Link annotation cannot be parsed

truongminhlong · February 20, 2017, 10:25pm

Hi Aspose,
When I load and process the attached PDFs with Aspose.Pdf 17.2.0, the link annotations cannot be parsed.
I have attached sample.zip; the files work fine in Acrobat Reader.
Page 9 of "metadefender-booklet.pdf" contains 4 links, and these links can still be clicked in the output file when it is opened in Acrobat Reader.
My sample code:
public static void TestHyperlink(String input, String output)
{
    File.Copy(input, output, true);
    Document pdfDocument = new Document(output);
    pdfDocument.Form.Type = FormType.Standard;
    PageCollection pages = pdfDocument.Pages;
    pdfDocument.Flatten();
    int numOfHyperLink = 0;
    int numOfAnnotation = 0;
    foreach (Page p in pages)
    {
        if (p == null)
        {
            continue;
        }
        foreach (Annotation anno in p.Annotations)
        {
            if (anno.AnnotationType == AnnotationType.Link)
            {
                LinkAnnotation linkAnno = (LinkAnnotation)anno;
                if (linkAnno.Action is GoToURIAction)
                {
                    numOfHyperLink++;

                    // specify the URI to disable hyperlink
                    GoToURIAction goToAction = (GoToURIAction)linkAnno.Action;
                    goToAction.URI = "http://#";
                }
            }
            else
            {
                p.Annotations.Delete(anno);
                numOfAnnotation++;
            }
        }
    }
}
Thank you

asad.ali · February 21, 2017, 10:27am

Hi Long,

Thanks for contacting support.

I have tested the whole scenario with the input files you shared, and my findings are as follows:

I noticed that the code snippet you shared flattens the PDF document, which makes its interactive content (form fields, annotations, etc.) part of the static page content. After flattening, the PDF still looks the same, but the user can no longer interact with that content. That is why the Annotations collection of each page always reports a count of zero once the document has been flattened.
When I processed "link_test1.pdf" and "link_test2.pdf" without flattening the document, the code that modifies the GoToURIAction executed fine and generated output files with modified link annotations. I have attached "link_test2_processed.pdf" for your reference.
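For illustration, the non-flattening variant described above could look roughly like the sketch below. It is not the exact code used for the attached output. It uses only the calls that already appear in this thread, and the class and method names (HyperlinkSketch, DisableHyperlinks) are hypothetical. As assumptions of this sketch, it opens the input directly and saves to the output path, and it collects the non-link annotations before deleting them so that the collection is not modified while it is being enumerated.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Pdf;
using Aspose.Pdf.Annotations;

public static class HyperlinkSketch // hypothetical name, for illustration only
{
    public static void DisableHyperlinks(string input, string output)
    {
        // No Flatten() call here, so the Annotations collections stay populated.
        Document pdfDocument = new Document(input);
        int numOfHyperLink = 0;
        int numOfAnnotation = 0;

        foreach (Page p in pdfDocument.Pages)
        {
            // Assumption of this sketch: collect first, delete afterwards,
            // to avoid changing the collection during enumeration.
            List<Annotation> toDelete = new List<Annotation>();

            foreach (Annotation anno in p.Annotations)
            {
                if (anno.AnnotationType == AnnotationType.Link)
                {
                    LinkAnnotation linkAnno = (LinkAnnotation)anno;
                    GoToURIAction goToAction = linkAnno.Action as GoToURIAction;
                    if (goToAction != null)
                    {
                        // Same placeholder URI as in the original question.
                        goToAction.URI = "http://#";
                        numOfHyperLink++;
                    }
                }
                else
                {
                    toDelete.Add(anno);
                }
            }

            foreach (Annotation anno in toDelete)
            {
                p.Annotations.Delete(anno);
                numOfAnnotation++;
            }
        }

        // The modified annotations only reach the output file on Save.
        pdfDocument.Save(output);
        Console.WriteLine("Links modified: " + numOfHyperLink +
                          ", other annotations removed: " + numOfAnnotation);
    }
}
```

A call such as HyperlinkSketch.DisableHyperlinks("link_test2.pdf", "link_test2_processed.pdf") would then write the modified copy. The file names are taken from this thread and the paths are assumed to be relative to the working directory.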
I also tested "metadefender-booklet.pdf" and noticed that it does not contain any annotations. The links on page 9 are just text. Please note that a LinkAnnotation has no text of its own, which is why text is placed under the annotation to make it look like a link to the user.
Moreover, when text that looks like a URL (e.g. www.google.com) is placed in a PDF, it is automatically rendered as clickable. To change the behavior of that type of link, the text itself has to be replaced (e.g. write just google.com). I processed "metadefender-booklet.pdf" and replaced the text "www.opswat.com" with "opswat.com" using the following code snippet. I have also attached the file for your reference.

Document pdfDocument = new Document("metadefender-booklet.pdf");
PageCollection pages = pdfDocument.Pages;
// Find every occurrence of the URL-like text on all pages
TextFragmentAbsorber filter = new TextFragmentAbsorber("www.opswat.com");
pages.Accept(filter);
foreach (TextFragment text in filter.TextFragments)
{
    // Drop the "www." prefix so the text is no longer shown as a link
    text.Text = text.Text.Replace("www.", "");
}
pdfDocument.Save("metadefender-booklet_modified.pdf");

If you need further assistance, please feel free to contact us.

Best Regards,

truongminhlong · March 5, 2017, 9:21pm

Thank you for your reply.