Process Word and PDF documents using Aspose.PDF and Aspose.Words .NET - dash gets broken down

Hello Aspose,

I have run into a surprising issue when I try to convert a Word document, created with Aspose.Words, to a PDF using Aspose.PDF (DLL version 3.1.3.0).

For the conversion, I use the code described in my later post about a page break issue: http://www.aspose.com/Community/forums/54261/ShowThread.aspx#54261

If the Word document includes two words separated by a dash, the formatting of both words and the dash is lost.

If I separate the two words or replace the dash, the formatting is fine.

http://zeta-software.de/ZetaUploader/Data/c74500b945934ce4a986d7f197400da8.zip

The EN DASH Unicode character that you use in this case seems to be rendered incorrectly by Aspose.PDF. As a workaround, you can replace it with a normal hyphen (minus) sign.
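To illustrate the workaround, here is a minimal sketch of the replacement on a plain string. It is written in Python only to show the idea and does not use any Aspose API; the function name `replace_en_dash` is hypothetical, and in a real project the same substitution would have to be applied to the document text before conversion.

```python
# Minimal sketch: swap the EN DASH (U+2013) for an ASCII hyphen-minus.
# Assumption: the text has already been extracted from the document as a string.
EN_DASH = "\u2013"


def replace_en_dash(text: str) -> str:
    """Return text with every EN DASH replaced by a plain hyphen-minus."""
    return text.replace(EN_DASH, "-")


if __name__ == "__main__":
    sample = "Word\u2013PDF"
    print(replace_en_dash(sample))
```

Note that this changes the document content (an en dash is typographically different from a hyphen), so it is only a stopgap until the rendering itself is fixed.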

As this is an Aspose.Pdf problem, I am moving this post to the Aspose.Pdf forum. They will respond to you shortly on this issue.

Thanks for considering Aspose.

We will investigate the cause further and reply as soon as possible.

Any progress on this issue yet?

We are having similar problems.

It seems to be connected with any characters on which MS Word performs special formatting; that is, the EM DASH, the EN DASH, double-quotes, and apostrophes. Apparently, Word stores the formatted characters as some kind of special characters. When these characters appear in a "run", the font and font-size of the entire run revert to default values.
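To check whether a given piece of text contains any of these characters, something like the following sketch could be used. It is plain Python, not Aspose code; the function name `find_suspect_chars` is hypothetical, and the set of code points is my assumption about which characters Word's automatic formatting typically inserts (en dash, em dash, curly quotes and apostrophes).

```python
import unicodedata

# Assumption: these are the characters Word substitutes when it auto-formats
# dashes, double quotes, and apostrophes.
SUSPECT_CHARS = {
    "\u2013",  # EN DASH
    "\u2014",  # EM DASH
    "\u201c",  # LEFT DOUBLE QUOTATION MARK
    "\u201d",  # RIGHT DOUBLE QUOTATION MARK
    "\u2018",  # LEFT SINGLE QUOTATION MARK
    "\u2019",  # RIGHT SINGLE QUOTATION MARK (also used as apostrophe)
}


def find_suspect_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each suspect character found in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in SUSPECT_CHARS
    ]


if __name__ == "__main__":
    for index, name in find_suspect_chars("It\u2019s a \u201ctest\u201d"):
        print(index, name)
```

Running this over the text of each run would show which runs are likely to lose their font settings in the PDF output.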

I do understand that you can disable this "character formatting" functionality in Word, but our problem is that we are dealing with Word documents created by other parties, so we have no control over which special characters these documents might contain.

Thank you.

We have not resolved this issue yet. Can you please provide an example so that we can check whether it is the same problem as the one in the post above?

Attached is an example (FontChange_SpecialChars.doc). This is a Word document, which, as the text of its 1st paragraph describes, has a default font of Verdana. This is followed by several paragraphs, each containing a different special character, and a final summary paragraph. The font of each paragraph is Verdana.

When this document is converted to a PDF document with Aspose.Words (ver 3.7.0.0) and Aspose.PDF (ver 3.1.7.1) using the "standard" routine, the font of each paragraph that contains a special character is changed to Times New Roman in the resulting PDF. You can see the differences by comparing the input Word document with the output PDF document.

Hi,

The attached PDF document was created with Aspose.Words v4.0.0.0 and Aspose.PDF v3.1.7.7. As you can see, all text is shown correctly in Verdana, including the paragraphs containing special characters. Could you point out any errors I missed?

  1. First, I suggest using the latest versions of Aspose.Words and Aspose.PDF to avoid unexpected errors.
  2. If the error still exists, please make sure Verdana is correctly installed on the system, or switch to another font to see whether the output is then correct. If it is, Verdana is not installed correctly.
  3. If it is not, please provide the Verdana font file that you used; we will investigate it and respond to you shortly on this issue.