Page Numbering Problem (Doc to PDF)

JerryZery · December 10, 2007, 10:57am

Hi,

We are using Aspose.PDF (v 3.6.1) and Aspose.Words (4.4.2) to convert Word documents to PDF. Our application is built around the page numbering in the documents: after converting the original DOC file to PDF, it saves the page number of each item into the DB.
But when converting a Word document to PDF, Aspose produces fewer pages. In the attached example of a Word file and its converted PDF, the DOC file has 26 pages and the PDF file has 25 pages. Because of that, when our application tries to access the 26th page of the converted PDF file and the page is not there, it throws an exception. Is there any way we can convert DOC files to PDF with exactly the same pages and page numbers?
Thanks in advance,
Jerry

Klepus · December 10, 2007, 12:49pm

Hello Jerry.
Thank you for asking this.
It’s technically impossible to guarantee that the PDF produced by conversion will be 100% equivalent to the source Word document. The discrepancies can be caused by differences between the typefaces used in MS Word and Adobe Acrobat, among other things. So in general you shouldn’t rely on page numbers and page layout being preserved. If you can tell us why you need these assumptions, we could suggest an alternative.
Thank you,

JerryZery · December 12, 2007, 9:49am

Hi, sorry for the delay in replying.
The source Word file is created by our application on the fly, and every paragraph in the file comes from the DB. We give our users the ability to click on an item, and our application jumps to the page number of that item in the Word/PDF file. Now, when the file is converted to PDF using Aspose, the page numbers of all of these items are off.
I think the major difference comes from the font size (text looks a little bigger in the Word file) and the line spacing (again a little bigger in the original Word file). If you can fix those two things, I think we will be OK.
Thanks,
Jerry

Klepus · December 12, 2007, 2:04pm

Hi Jerry.
Thank you for asking this.
In general we cannot produce a PDF that matches the source MS Word document 100%, because it is technically impossible. As you noted, fonts are rendered differently in MS Word and Adobe Acrobat. This is because the typefaces are different too.
As I understand it, you programmatically position a viewer on the page you expect to contain certain known document items. Would you please describe to us what software you are using and how you are choosing pages? Maybe PDF bookmarks could be used to navigate in the PDF? We implemented bookmark propagation to segment IDs several releases ago, and Aspose.Pdf declares the ability to determine which page contains a particular segment. Have you tried using this facility? Just don’t assume that things are on the same page in the DOC and the PDF. Let me know further details.
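To illustrate the idea of not reusing DOC page numbers, here is a minimal Python sketch. It assumes the item-to-page mapping has already been read from the converted PDF (for example, from its bookmarks) rather than from the Word file. The names `resolve_page` and `bookmark_pages` and the sample data are hypothetical, and no Aspose API is used here.

```python
from typing import Dict, Optional


def resolve_page(item_id: str,
                 bookmark_pages: Dict[str, int],
                 page_count: int) -> Optional[int]:
    """Return the 1-based PDF page for an item, or None if it cannot be resolved.

    bookmark_pages is assumed to come from the converted PDF itself,
    not from the page numbers of the source DOC file.
    """
    page = bookmark_pages.get(item_id)
    # Guard against unknown items and pages that do not exist in the PDF,
    # instead of letting the viewer throw on an out-of-range page.
    if page is None or not (1 <= page <= page_count):
        return None
    return page


# Hypothetical mapping for a 25-page PDF.
bookmark_pages = {"item-1": 1, "item-42": 25}
pdf_page_count = 25

target = resolve_page("item-42", bookmark_pages, pdf_page_count)
```

With a lookup like this, an item that was on page 26 of the DOC file is resolved to wherever it actually landed in the PDF, and a missing or out-of-range page is reported as `None` instead of causing an exception.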
Regards,