How to find page number of Node using Java

I am not sure if this is even possible, but I am searching a document for some specific text and need to return the page number that the text is on. Is this possible? Or are things like page numbers only calculated after the actual document is created?

Hi Eddie,

Thanks for your query. Please use the following code snippet for your requirement. Hope this helps you. Please let us know if you have any more queries.

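import com.aspose.words.*;

// MYDir is assumed to hold the path of the folder that contains in.docx.
Document doc = new Document(MYDir + "in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
// Move the cursor to where the page number is needed (here, the end of the document).
builder.moveToDocumentEnd();
// Insert a PAGE field and update it so that its result is calculated.
Field page = builder.insertField("PAGE", "");
page.update();
System.out.println(page.getResult());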

Thank you. This is exactly what I am looking for.

As a follow-up question, I have been testing with this and noticed that it returns the wrong page number about 25% of the time, most often reporting page 1 when the actual page is higher. Would you have any clue why this is? Is it because page.update() makes an educated guess? Note that I am inserting this field into a Run node; I am not sure whether that plays any part in why I sometimes get the wrong answer. Thanks again!

Hi Eddie,

Thanks for your inquiry.

In addition to Tahir’s answer, please note that page numbers are represented by a PAGE field in MS Word documents, and in Aspose.Words the value of this field depends on the page layout algorithms. When you open a document with MS Word, it calculates the numbers on the fly. The code suggested by Tahir is correct; however, you may want to remove the newly inserted PAGE field from your document by adding the following line at the end:

// Remove the PAGE field.
page.remove();

Best Regards,

Thanks for the further information on this. So just to clarify, it sounds like the reason this sometimes returns an incorrect page number is that the Aspose.Words page layout algorithm calculates an approximation? It just seems unusual, because when it is wrong, it usually returns a page number of exactly one. For anything other than one, the algorithm has always been right from what I have seen.

Hi Eddie,

Thanks for your inquiry. I would like to share with you that an MS Word document is a flow document. This means that it does not contain any information about its layout into lines and pages. Pages are created by MS Word on the fly, and Aspose.Words uses its own rendering engine to lay out documents into pages.
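
To illustrate the point about flow documents, here is a minimal plain-Java sketch (it is not Aspose.Words code). It assumes a hypothetical, greatly simplified layout in which each paragraph occupies a known number of lines and every page holds a fixed number of lines; the class and method names are made up for this example. A real layout engine also has to account for fonts, margins, tables, page breaks and much more.

```java
final class FlowLayoutSketch {

    /**
     * Returns the 1-based page on which each paragraph starts.
     * Assumption: paragraphs are placed one after another with no gaps,
     * and a page always holds exactly linesPerPage lines.
     */
    static int[] startPages(int[] paragraphLineCounts, int linesPerPage) {
        if (linesPerPage <= 0) {
            throw new IllegalArgumentException("linesPerPage must be positive");
        }
        int[] pages = new int[paragraphLineCounts.length];
        int linesUsed = 0; // total number of lines placed before the current paragraph
        for (int i = 0; i < paragraphLineCounts.length; i++) {
            // The page is derived from everything laid out so far, not stored anywhere.
            pages[i] = linesUsed / linesPerPage + 1;
            linesUsed += paragraphLineCounts[i];
        }
        return pages;
    }

    public static void main(String[] args) {
        int[] paragraphs = {10, 25, 5, 40};
        System.out.println(java.util.Arrays.toString(startPages(paragraphs, 30)));
        System.out.println(java.util.Arrays.toString(startPages(paragraphs, 20)));
    }
}
```

With 30 lines per page the four paragraphs start on pages 1, 1, 2 and 2; with 20 lines per page the same content starts on pages 1, 1, 2 and 3. The page of a paragraph is only known after everything before it has been laid out.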

It would be great if you could share the document for which the DocumentBuilder.insertField method gives you the wrong page number, so that we can investigate.

Hi Eddie,

Thanks for your inquiry.

The Aspose.Words rendering engine attempts to render documents exactly as they appear in Microsoft Word.

The issue you are having can occur if a section has the setting to restart page numbering enabled. For example, you can set a section to start at page 3 instead of page 1; as a result, the value of a PAGE field in that section no longer matches the physical page it is on.
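
As a rough illustration of how a restart affects the value shown by a PAGE field, the following plain-Java sketch models each section by its page count and restart setting. It is not Aspose.Words code; `SectionInfo` and `displayedPageNumber` are hypothetical names used only for this example.

```java
import java.util.Arrays;
import java.util.List;

final class PageFieldSketch {

    /** Hypothetical stand-in for a section: its length in pages and its restart setting. */
    static final class SectionInfo {
        final int pageCount;
        final boolean restartNumbering;
        final int startingNumber; // only used when restartNumbering is true

        SectionInfo(int pageCount, boolean restartNumbering, int startingNumber) {
            this.pageCount = pageCount;
            this.restartNumbering = restartNumbering;
            this.startingNumber = startingNumber;
        }
    }

    /** Returns the number a PAGE field would show on the given 1-based physical page. */
    static int displayedPageNumber(List<SectionInfo> sections, int physicalPage) {
        if (physicalPage < 1) {
            throw new IllegalArgumentException("physicalPage must be at least 1");
        }
        int firstPhysical = 1;  // physical page on which the current section starts
        int firstDisplayed = 1; // number displayed on that page
        for (SectionInfo section : sections) {
            if (section.restartNumbering) {
                firstDisplayed = section.startingNumber;
            }
            int lastPhysical = firstPhysical + section.pageCount - 1;
            if (physicalPage <= lastPhysical) {
                return firstDisplayed + (physicalPage - firstPhysical);
            }
            // Without a restart, the next section continues the numbering.
            firstPhysical = lastPhysical + 1;
            firstDisplayed += section.pageCount;
        }
        throw new IllegalArgumentException("physicalPage is beyond the end of the document");
    }

    public static void main(String[] args) {
        List<SectionInfo> sections = Arrays.asList(
                new SectionInfo(3, false, 1),  // physical pages 1-3
                new SectionInfo(2, true, 1));  // physical pages 4-5, numbering restarts at 1
        System.out.println(displayedPageNumber(sections, 4));
    }
}
```

For a 3-page section followed by a 2-page section that restarts at 1, physical page 4 is displayed as page 1. That would be consistent with a PAGE field reporting 1 for text that sits further into the document.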

Please try using the PageNumberFinder class attached to this thread instead and see how you go. This class knows how to temporarily work around any page numbering restarts to get the correct page number of a node.

Thanks,

The issues you have found earlier (filed as WORDSNET-2978) have been fixed in this .NET update and this Java update.
