Word to PDF conversion errors

I have created this thread to record the errors that I encounter when converting Word documents to PDF documents.

Although the Aspose components do an excellent job of conversion, the output unfortunately has to be perfect for the application that I am developing. The documents that I am converting push Word to the limits of what it should really do, and therefore I am seeing minor errors in most of them.

The attached document shows the following errors when using Aspose.Words v3.5.1.0 and Aspose.PDF v2.9.3.0.

1) The "Missing Payments" and "Important - Read this carefully..." headings are misaligned compared to the "Your home may be repossesed" heading. I can see that this is because they have different styles. Currently I can fix problems like this when the document is in my control but once the software is deployed I will no longer be able to do this as the documents will be maintained by our end-users.

2) The image before the "Important - Use of your imformation" is placed on a separate line. This is a known issue with the Aspose.PDF component.

3) The checkbox following the "Marketing and Market Research" paragraph is missing.

4) The footer is displayed on even numbered pages in the PDF document but odd numbered pages in the Word document.

5) The image in the footer is displayed on a separate line (I may be able to get away with this because it actually looks better that way). This appears to be the same as issue 2.

6) When the document is printed from Word, the "watermark" text in the bottom right-hand corner does not appear. However, it appears in the PDF document in a normal font.

The attached document shows the following errors when using Aspose.Words v3.5.1.0 and Aspose.PDF v2.9.3.0. This document is a companion to the above document and therefore shows all the same errors. In addition, it contains the following error:

1) The signature boxes are missing from the generated PDF document. I believe this has something to do with them being placed within floating textboxes.

Thanks for reporting these issues to us. I will check them and let you know of the results.

The attached document shows the following errors when using Aspose.Words v3.5.1.0 and Aspose.PDF v2.9.3.0. This document is again a companion to the above documents and therefore shows similar errors. In addition, it contains the following errors:

PAGE 1 Errors

1) The company logo image after the initial heading appears on a separate line. This is a known issue with Aspose.PDF.

2) The box contained within the middle of the text "Interest rate fixed for..." appears on a separate line before the line containing the text.

3) The box containing the interest rate preceding the "Per annum and thereafter..." text displays incorrectly on a separate line.

4) The text "Per annum and thereafter..." appears incorrectly.

5) Headings that are aligned in the Word document are no longer aligned in the PDF document ("LOAN DETAILS...", "Total loan amount...", "Variations").

6) The Yes/No checkboxes in the "Optional Payment Protection Insurance" section have been stretched vertically.

7) In the "BORROWERS' DECLARATION..." section, the "You agree and declare that" text is misaligned.

8) In the same section, the text content of the list is not aligned in sections 2, 3 and 4 when the text spans multiple lines.

9) Although the lock image in item 6 of the above list is correctly placed, the resolution seems to have been significantly decreased so that the image is extremely pixellated.

10) The checkbox at the end of the paragraph in item 6 is missing.

PAGE 2 Errors

11) The "TERMS AND CONDITIONS" header appears in the wrong place.

12) The "Important - You should read..." header contains the characters "#$NP" immediately before it.

13) All the headings are misaligned with the text following them.

Concerning Regulated_AdvanceCopy.dot

1) The defect is logged as issue #839.
2) Aspose.Pdf defect. Already logged as issue #824.
3) Aspose.Words defect. Logged as issue #840.
4) Not a footer but floating image which is placed incorrectly. The defect is logged as issue #841.
5) Same as (2). Already logged as issue #824.
6) Not a defect. The "watermark" is a footer set in a regular font. It probably fails to print from MS Word at times because it falls outside the printable area of the printer.

Concerning Regulated_SignatureCopy.dot

Ungroup the signature shapes to avoid the problem. Handling of grouped images is not currently supported by Aspose.Words.

Concerning Unregulated.dot

1) Same Aspose.Pdf defect. Already logged as issue #824.
2) Inline table in text is placed incorrectly. Logged as issue #842.
3) Floating text boxes are placed incorrectly. Logged as issue #843.
4) Same as (3)
5) Incorrect alignment of headings. Logged as issue #844.
6) Textboxes are stretched vertically. Logged as issue #845.
7) Same as (5).
8) I agree that the conversion is incorrect but this problem can be avoided by setting the same formatting and tabstops for all members of the list. I have logged this as issue #846.
9) Image is pixellated. Logged as issue #847.
10) Aspose.Words defect. Logged as issue #840.
11) Same as (3). Logged as issue #843.
12) Invalid characters in converted document. Aspose.Words problem. Logged as issue #848.
13) Same as (3). Logged as #843.
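
Several of the reported items map to the same logged issue, so it can help to group them by issue number when following up. The Python sketch below is only an illustration: the mapping is transcribed from the replies above, and the names `TRIAGE` and `items_by_issue` are hypothetical, not part of any Aspose tool.

```python
from collections import defaultdict

# Reported item number -> logged issue number, transcribed from the replies
# in this thread. None marks an item that was judged not to be a defect.
# TRIAGE and items_by_issue are hypothetical names used for illustration.
TRIAGE = {
    "Regulated_AdvanceCopy.dot": {
        1: 839, 2: 824, 3: 840, 4: 841, 5: 824, 6: None,
    },
    "Unregulated.dot": {
        1: 824, 2: 842, 3: 843, 4: 843, 5: 844, 6: 845, 7: 844,
        8: 846, 9: 847, 10: 840, 11: 843, 12: 848, 13: 843,
    },
}


def items_by_issue(triage):
    """Group (document, item) pairs under the issue number they were logged as."""
    grouped = defaultdict(list)
    for document, items in triage.items():
        for item, issue in items.items():
            if issue is not None:  # skip items that are not defects
                grouped[issue].append((document, item))
    return dict(grouped)


if __name__ == "__main__":
    for issue, reports in sorted(items_by_issue(TRIAGE).items()):
        print(f"#{issue}: {reports}")
```

Grouping this way makes it easy to see which single fix would resolve several reported items at once.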

We will deal with these issues as soon as our schedule permits. We will also coordinate our efforts with the Aspose.Pdf team. Please post a message in their forum referring to this thread.

I am attaching the file with the conversion results here for convenience in our future analysis and research.

Here are some interim results for the logged defects:

  1. Defect #839 "Heading alignment is incorrect": fixed in Aspose.Pdf 2.9.4.
  2. Defect #840 "Checkbox is not exported":

    The checkbox following the "Marketing and Market Research" paragraph is actually a rectangle autoshape. Autoshapes are not exported to PDF, so you need to turn it into something else. The easiest option is to insert a Wingdings symbol (symbol codes 0x6F-0x72).
  3. Defect #841 "Floating text box is shifted to another page":

    It is a problem in Aspose.Pdf. We've notified the Aspose.Pdf team about this.

    As a workaround, use a footer instead of a textbox, or move the anchor up one paragraph. At the moment the textbox is anchored to the last paragraph on the page, which contains the page break character, and that results in Aspose.Pdf drawing the textbox on the next page.
  4. Defect #842 "Inline table in text is placed incorrectly":

    This is known issue #125. Absolutely positioned tables are not yet supported when exporting to PDF. Use a textbox or a "table grid" instead.
  5. Defect #843 "Textboxes "Per annum and thereafter...", "TERMS AND CONDITIONS" and some others are misplaced":

    The textboxes are misplaced because they are anchored to a paragraph inside a table. We've reported this to the Aspose.Pdf team, but you can work around the problem. For example, you can create a cell (probably more than one) in the table and use it to arrange the text.
  6. Defect #844 "Incorrect alignment of headings ("LOAN DETAILS...", "Total loan amount...", "Variations")":

    Fixed in Aspose.Pdf 2.9.4.
  7. Defect #845 "Textboxes are stretched vertically":

    These boxes are textboxes and they contain a paragraph break character. Aspose.Pdf seems to increase the height of the textboxes to fit the font of the paragraph break. You can try reducing the font size inside these boxes, but it is better to replace them with a Wingdings character that looks like a box.
  8. Defect #846 "Incorrect alignment in "BORROWERS' DECLARATION..." for sections 2, 3 and 4":

    Seems to work correctly with Aspose.Words 3.5.2 and Aspose.Pdf 2.9.5.
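
The workaround suggested above for defects #840 and #845 replaces the shape with a Wingdings symbol in the range 0x6F-0x72. As a rough aid for picking one, the Python sketch below lists those four codes together with the plain character that produces each glyph when the Wingdings font is applied. The helper name `symbol_candidates` and the 0xF000 offset are assumptions introduced here: Word commonly stores symbol-font characters in the Unicode Private Use Area at 0xF000 plus the character code, but please verify that against your own documents. Nothing in this sketch calls the Aspose API.

```python
# Wingdings character codes suggested in this thread: 0x6F through 0x72.
WINGDINGS_BOX_CODES = range(0x6F, 0x73)

# Assumption: symbol-font characters are often stored in the Unicode Private
# Use Area at 0xF000 + code. Check this against your own documents.
SYMBOL_PUA_OFFSET = 0xF000


def symbol_candidates(codes=WINGDINGS_BOX_CODES):
    """Return one row per candidate code with its plain and PUA forms."""
    rows = []
    for code in codes:
        rows.append({
            "hex": f"0x{code:02X}",
            # Character to type while the Wingdings font is selected.
            "latin_char": chr(code),
            # Private Use Area code point under the assumption above.
            "pua_code_point": f"U+{code + SYMBOL_PUA_OFFSET:04X}",
        })
    return rows


if __name__ == "__main__":
    for row in symbol_candidates():
        print(row["hex"], repr(row["latin_char"]), row["pua_code_point"])
```

Typing one of the listed characters in Word with the Wingdings font selected should show which of the four glyphs best matches the original checkbox.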

Simple inline images are now supported in Aspose.PDF v3.0.0.0. Do you have a timescale for when this feature will be supported in Aspose.Words for Word to PDF conversion?

We will look into this feature and may add support for it in the next release of Aspose.Words, due in a week or so. You will be notified of the results here in this thread.

Issue #848 “Invalid characters in converted document” no longer occurs with the latest Aspose.Pdf 3.0 and Aspose.Words 3.6.