Get Page Number of a Bookmark or Run of Text

I need to get the page number of a piece of text or bookmark within a document. Any suggestions?

Currently this is only possible if you can mark the page boundaries with page or section breaks. Otherwise, a pagination engine is required. We are working on one right now, and it is expected to be released in April.
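To illustrate the break-based workaround: if every page boundary is an explicit break, the page of a piece of text is just the number of breaks before it, plus one. The sketch below is plain Python over a string and does not use any Aspose API. The form feed character standing in for a page or section break, the function name `page_of`, and the `[BM_CH1]` marker are all assumptions made for the example.

```python
PAGE_BREAK = "\f"  # assumption: a form feed stands for an explicit page or section break


def page_of(text: str, marker: str) -> int:
    """Return the 1-based page on which `marker` first appears.

    Only explicit breaks are counted, so the result is correct only when
    every page boundary in `text` is marked by PAGE_BREAK.
    """
    position = text.find(marker)
    if position < 0:
        raise ValueError(f"marker {marker!r} not found")
    # Count the breaks that occur before the marker; the first page is page 1.
    return text.count(PAGE_BREAK, 0, position) + 1


# Hypothetical document text with three explicit breaks.
sample = "Cover page\fIntroduction\fChapter 1 [BM_CH1] begins here\fAppendix"
print(page_of(sample, "[BM_CH1]"))  # prints 3
```

This breaks down as soon as text flows naturally onto a new page without a break, which is the case a pagination engine is needed for.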

Will the bookmarks and their names carry over into PDF format?

I was thinking that we could save as PDF and then parse the PDF file to get the page numbers from it, but the bookmark names would have to carry over into the PDF, and there would have to be a way to read the bookmarks from that file.

I don’t think this idea is acceptable. Bookmarks are exported to PDF with a number of limitations:

Bookmarks are anchored to the start of a paragraph, not to an exact position in the text. If there is more than one bookmark in a paragraph, only the last one will appear in the PDF file.

Sorry to disappoint you, but I think it’s better to wait until the pagination module is completed.

Those limitations are fine for me. I am compiling many documents into a single document and need to record the page number of each of those sub-documents. As long as I can have one bookmark per paragraph and can find out its page number, that is fine. I don’t need to know the exact position in the text or have more than one bookmark per paragraph.

Does the bookmark name carry over?

Judging by the code, no, it does not. Also, bookmarks in PDF are a totally different thing from bookmarks in Word documents. You can consult the Aspose.PDF team if you want to know more.

This poses a serious problem for me, as we have a deadline very soon and this is one of the last loose ends that we were counting on. Is there anything “out of the box” we can try?

If those bookmarks convert into anchor points in the PDF couldn’t we use the PDF library to get those anchors?

I have answered your question at this thread. I am not sure if that can help you.

Just wanted to reopen this thread to see if there has been anything new developed to do this. DimitryV said pagination was coming in April of last year?

I noticed that PdfExtractor has the GetNextPageText which I could use if I converted the document to Pdf, but that seems like a lot of overhead. Is there a way to grab the text on a Word doc page directly?

Hi

Thanks for your inquiry. Unfortunately, the pagination engine is not ready yet. Hopefully this feature will be supported sometime in 2009.

Best regards.

Cool, thanks for the quick response. Any idea when in 2009? It’s something I’m looking forward to a lot.

Hi

Thanks for your inquiry. I can’t tell you the exact date when this feature will be supported. Hopefully in Q1 2009, but I can’t promise anything.

Best regards.

I’m reopening this thread too, because this is something that interests me.
For now I’m using another component just to get the page numbers of those bookmarks.

It was supposed to be delivered in Q1 2009. Where are we now, in Q2 2010?

Regards

Hi

Thank you for reopening this thread. We have made great progress toward this feature. As you may know, we have already implemented our Rendering Engine, which allows converting a document to PDF, XPS, and images. Internally we can already determine the page on which a particular node is placed, but there is no public API for this yet.

Your request has been linked to the appropriate issue. We will notify you as soon as this feature is available in the public API.

May I ask why you need to determine the number of the page on which a particular node or bookmark is placed?

Best regards.

Absolutely:

We’re converting a Word document to PDF.
Then we modify that PDF: we have to insert other PDF documents between some bookmarks.
To do that, we look up the pages the bookmarks point to and merge the other PDFs there.
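The merge step described here has one pitfall worth sketching: every insertion shifts the page numbers of the bookmarks after it. A minimal sketch in plain Python follows, with pages modelled as list items and no PDF library involved. The function name `merge_at_bookmarks`, the data shapes, and the choice to insert *before* the bookmarked page are assumptions for illustration, not the poster's actual code.

```python
from typing import Dict, List


def merge_at_bookmarks(
    base_pages: List[str],
    bookmark_pages: Dict[str, int],
    inserts: Dict[str, List[str]],
) -> List[str]:
    """Insert each extra document before the page its bookmark points to.

    bookmark_pages maps bookmark name -> 1-based page in base_pages.
    inserts maps bookmark name -> pages of the document to merge there.
    """
    merged = list(base_pages)
    # Work from the last bookmarked page backwards, so an insertion never
    # shifts the page number of a bookmark that is still to be processed.
    for name in sorted(inserts, key=lambda n: bookmark_pages[n], reverse=True):
        index = bookmark_pages[name] - 1
        merged[index:index] = inserts[name]
    return merged


# Hypothetical data: a 4-page converted document with two bookmarks.
base = ["w1", "w2", "w3", "w4"]
marks = {"A": 2, "B": 4}
extra = {"A": ["a1", "a2"], "B": ["b1"]}
print(merge_at_bookmarks(base, marks, extra))
# prints ['w1', 'a1', 'a2', 'w2', 'w3', 'b1', 'w4']
```

Processing in reverse page order avoids having to keep a running offset. With a real PDF library, the same ordering applies to whatever page-insert call it provides.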

Hi

Thank you for the additional information. In this case, you do not actually need to determine which page of the MS Word document a bookmark is on. Bookmarks are preserved when a Word document is converted to PDF, and since PDF is a fixed-page format, it is easy to determine which page a bookmark is on. You can use Aspose.Pdf.Kit to determine this:

https://reference.aspose.com/words/net/aspose.words.layout/layoutcollector

Best regards.

The issues you have found earlier (filed as WORDSNET-2978) have been fixed in this .NET update and this Java update.

