
Free Support Forum - aspose.com

PDF Content and Page Numbers

I want to parse the content of a PDF file and create bookmarks on particular pages. Can I search for text in a PDF and get back the page number the text was found on?

Think I have found my own answer.


I am getting the text a page at a time. If there is a better way, I would be happy to hear it.

Cheers guys.

Hello Shane,

Thanks for considering Aspose.

Aspose.Pdf.Kit for Java has a class named PdfSearcher that can search for particular text within a rectangle, but I am afraid it cannot return the page number on which the text is found. You can work this out programmatically: use setStartPage and setEndPage to limit the page range being searched (for example, one page at a time), then search for the text pattern with searchTextInRectangle.
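
A minimal sketch of that page-by-page loop, in plain Java. It does not call Aspose at all: the class PageByPageSearch, the method findPages, and the pageContainsText callback are hypothetical names made up for illustration. The callback stands in for whatever you do per page with setStartPage, setEndPage and searchTextInRectangle, whose exact signatures are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical helper, not part of Aspose.Pdf.Kit.
class PageByPageSearch {

    /**
     * Probes pages 1..pageCount one at a time and returns the 1-based
     * numbers of the pages for which the probe reports a match.
     *
     * pageContainsText is an assumption: it is expected to restrict the
     * search to the given page and return true if the text was found there.
     */
    static List<Integer> findPages(int pageCount, IntPredicate pageContainsText) {
        List<Integer> hits = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            if (pageContainsText.test(page)) {
                hits.add(page); // the loop counter is the page number
            }
        }
        return hits;
    }
}
```

For example, findPages(5, page -> page == 2 || page == 4) returns the list [2, 4]. In real use the lambda would wrap the actual search call for that single page.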

FYI: PdfExtractor is a class that extracts the text or image contents of a PDF file as a whole. To search for a particular text string, you would have to parse the extracted contents programmatically and search for the text pattern yourself, but this method will not help in retrieving the page number.

Thanks for that Nayyer.

I have gone with PdfExtractor. Using the GetNextPageText method, I can keep track of the page numbers myself, so I am able to write the correct bookmarks when I get a search match.
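
For anyone finding this later, here is a rough sketch of the counting idea in plain Java. It assumes the text of each page has already been pulled out in order (for instance with GetNextPageText) and is handed over as an Iterator of strings. BookmarkPageTracker, Bookmark and bookmarksFor are hypothetical names for illustration, not Aspose classes, and writing the bookmarks back into the PDF is left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Aspose.Pdf.Kit.
class BookmarkPageTracker {

    /** A bookmark to create later: a title and a 1-based page number. */
    static final class Bookmark {
        final String title;
        final int page;

        Bookmark(String title, int page) {
            this.title = title;
            this.page = page;
        }
    }

    /**
     * Walks the page texts in order, counting pages as it goes, and records
     * one bookmark for every page whose text contains searchText.
     * The match is a plain case-sensitive substring test, so text that is
     * split across a page break will not be found.
     */
    static List<Bookmark> bookmarksFor(Iterator<String> pageTexts, String searchText) {
        List<Bookmark> bookmarks = new ArrayList<>();
        int page = 0;
        while (pageTexts.hasNext()) {
            page++; // each extracted page advances the counter by one
            String text = pageTexts.next();
            if (text != null && text.contains(searchText)) {
                bookmarks.add(new Bookmark(searchText, page));
            }
        }
        return bookmarks;
    }
}
```

The page number comes only from the order in which the pages are extracted, so every page has to be counted, including pages with no text.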

Got it working about 10 minutes ago.

Thanks for the quick response. You guys rock.