I have an existing PDF file with an index at the start.
The index is in table format, with Ref number, Name, and Page number columns.
Below is an example of what my index looks like.
I want to add a hyperlink to the name of each index entry that jumps to the corresponding page.
Basically, I want to find a particular phrase (which may contain spaces) in the PDF file and add a hyperlink to it.
Is this possible with Aspose.Pdf?
Yes, your requirements can be achieved using Aspose.PDF. You can search for your target text using the TextFragmentAbsorber class and add a local hyperlink to each text fragment found. In case you face any issue, please share your sample PDF document with us. We will test the scenario in our environment and address it accordingly.
Thanks for your quick response.
I tried the links you provided and they work like a charm. However, the search cannot find a sentence that contains a line break.
I am attaching my sample PDF file, in which I have added hyperlinks to the two index entries that don't have a line break. I was not able to add a hyperlink to the first entry of the index.
We noticed a similar issue on our end while using Aspose.PDF for .NET 21.1. Please note that you need to use regular expressions to search for multiline text or text containing a line break. We used the code snippet below:
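using System;
using Aspose.Pdf;
using Aspose.Pdf.Annotations;
using Aspose.Pdf.Text;

// dataDir is the folder that holds the sample PDF.
Document doc = new Document(dataDir + "DriveToPdf.pdf");
// Regex search: (?i) makes the match case-insensitive and \s* lets the phrase span line breaks.
TextFragmentAbsorber absorber = new TextFragmentAbsorber(@"(?i)This\s*is\s*a\s*test\s*document\s*which\s*contains\s*some\s*data\s*click\s*here\s*to\s*go\s*to\s*destination\s*click\b", new TextSearchOptions(true));
// Search only the first page, where the index is located.
doc.Pages[1].Accept(absorber);
if (absorber.TextFragments.Count > 0)
{
    foreach (TextFragment tf in absorber.TextFragments)
    {
        Console.WriteLine(tf.Text);
        // Place a link over the matched text that jumps to page 2.
        LinkAnnotation link = new LinkAnnotation(tf.Page, tf.Rectangle);
        link.Action = new GoToAction(doc.Pages[2]);
        tf.Page.Annotations.Add(link);
    }
}
doc.Save(dataDir + "output.pdf");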
The API was unable to find the text. We used TextAbsorber to see in which format the text was present in the PDF and found that it was extracted as below:
This is a test document that contains some data click here to go to 2
destination click
We tried changing the regular expression accordingly but still had little success. Therefore, we have logged an issue as PDFNET-49312 in our issue tracking system for further investigation. We will look into the details and keep you posted on the status of the fix. Please be patient and spare us some time.
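For anyone experimenting with patterns in the meantime, here is a minimal sketch of building a whitespace-tolerant pattern from a phrase. It uses only the standard .NET Regex class; PhrasePattern and Build are hypothetical names made up for this illustration, and the sketch says nothing about how Aspose.PDF matches text internally. It may still help to check a pattern against the text that TextAbsorber reports before passing it to TextFragmentAbsorber.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PhrasePattern
{
    // Hypothetical helper (not part of Aspose.PDF): turns a phrase into a
    // case-insensitive regex in which every gap between words matches one
    // or more whitespace characters, including line breaks.
    public static string Build(string phrase)
    {
        // Passing a null separator array splits on any whitespace.
        var words = phrase
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => Regex.Escape(w));

        return "(?i)" + string.Join(@"\s+", words);
    }
}
```

With the extracted text quoted above, Regex.IsMatch succeeds for PhrasePattern.Build("go to 2 destination click"), because \s+ also covers the line break. A pattern built from "document which contains" does not match, since the extracted text reads "that contains" and also carries the page number "2" before "destination".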
Can you suggest the best way to create a PDF with an index table whose entries link internally within the PDF?
Our requirement is this: let's say we have three separate PDF documents. We merge these three documents into a single PDF document and add an index table at the beginning. Each entry in this index table should link to the page where the corresponding document starts, as explained by @tgyogesh.
We have 200+ documents to be merged and indexed in this way.
Can you advise on the best way to generate an index with links within the same PDF using Aspose?
The basic approach is to add a TOC to the PDF document. However, if you want to add a custom index page with your own formatting, you can do it by adding a table.
If you know the page numbers and the corresponding headings that are supposed to be linked in the index, you can achieve your requirement using a table. Please let us know, and share a sample PDF, if you face any difficulty implementing your requirements.
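As a rough illustration of where those page numbers could come from when merging, the sketch below computes the page on which each source document starts in the merged file. It is plain C# with no Aspose calls; IndexPlanner, StartPages, and the sample page counts are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class IndexPlanner
{
    // Hypothetical helper (not part of Aspose.PDF): given the page count of
    // each source document, in merge order, and the number of index pages
    // placed in front, returns the 1-based page in the merged PDF on which
    // each source document starts.
    public static List<int> StartPages(IReadOnlyList<int> pageCounts, int indexPageCount)
    {
        var starts = new List<int>(pageCounts.Count);
        int next = indexPageCount + 1; // first page after the index

        foreach (int count in pageCounts)
        {
            starts.Add(next);
            next += count;
        }

        return starts;
    }
}
```

For example, with three documents of 3, 5, and 2 pages and a one-page index, StartPages returns 2, 5, and 10. Each value could then serve as the target page of a GoToAction, as in the snippet earlier in this thread. With 200+ documents the index will probably span several pages, so indexPageCount has to be known before the links are created.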
Sadly, the linked ticket has not yet been resolved. However, could you please share your document and other details, such as a sample code snippet, so that we can create a dedicated ticket for your case and file?