Free Support Forum - aspose.com

Word to PDF conversion with quotation mark converted to Unicode

I have found that Aspose.Words, for some reason, marks an entire paragraph as Unicode if it contains a quotation mark. This causes PDFs to be created slowly and increases the file size. The font also gets added to the PDF document twice, appearing as:

ArialMT

ArialMT (Embedded Subset)

Is there any way to force quotation marks to be shown normally, or to somehow override the IsUnicode attribute, since it’s not needed in this situation?

<Segment IsUnicode="true" FontName="Arial" FontSize="12">Other then that it’s an excellent bit of software.</Segment>

I also noticed that even a blank Word document adds 'TimesNewRomanPSMT' as a font to the PDF. Why does this occur?

Let me know if you want an example document and source code. I have attached an XML file; hopefully that's enough.

Hi,

Thank you for considering Aspose. The XML is generated by Aspose.Words, so I am moving this post to the Aspose.Words forum.

Hello Chris.

Thank you for reporting this. I'll check the logic in the Aspose.Words export engine.

Regards,

Hello!

MS Word replaces some characters with others as we type. For instance, the simple apostrophe (') with code 39 can be replaced with either (‘) or (’), whose codes are 8216 and 8217 respectively. In general, we cannot decide whether Unicode is needed in Aspose.Pdf to render these characters with the current font, so we mark the text run containing them with IsUnicode. This is by design.
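As an illustration of the character codes mentioned above, here is a minimal Python sketch. The function `run_needs_unicode` and the ASCII cutoff are assumptions made for this example only; they are not part of Aspose.Words or Aspose.Pdf and do not claim to reproduce how the export engine decides to set IsUnicode.

```python
# The three characters discussed above and their decimal codes.
STRAIGHT_APOSTROPHE = "'"       # code 39
LEFT_SINGLE_QUOTE = "\u2018"    # code 8216
RIGHT_SINGLE_QUOTE = "\u2019"   # code 8217

# Assumption: a run is flagged when any character lies outside 7-bit ASCII.
# This cutoff is illustrative and is not Aspose's actual rule.
ASCII_LIMIT = 127


def run_needs_unicode(text: str) -> bool:
    """Return True if any character in the run is above the ASCII range."""
    return any(ord(ch) > ASCII_LIMIT for ch in text)


if __name__ == "__main__":
    for ch in (STRAIGHT_APOSTROPHE, LEFT_SINGLE_QUOTE, RIGHT_SINGLE_QUOTE):
        # Print each character, its code, and whether it would flag a run.
        print(repr(ch), ord(ch), run_needs_unicode(ch))
```

Under this assumed rule, a single typographic apostrophe is enough to flag the whole run, which matches the behaviour described in the thread.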

Regards,

1) I also noticed that even a blank Word document adds 'TimesNewRomanPSMT' as a font to the PDF. Why does this occur?

2) I see what you're saying, Klepus, but wouldn't it make more sense if Aspose.Words created the XML with as little Unicode as possible? So the previous segment:

<Segment IsUnicode="true" FontName="Arial" FontSize="12">Other then that it’s an excellent bit of software.</Segment>

would become the following. Otherwise the entire paragraph becomes Unicode, not just the intended special character.

Other then that it

s an excellent bit of software.

The speed difference is dramatic on a big document; try it for yourself.
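The splitting proposed in this post can be sketched in Python as follows. This is only an illustration of the idea, not Aspose.Words code: `split_runs` and its `ascii_limit` parameter are hypothetical names, and the rule "anything above ASCII needs Unicode" is an assumption.

```python
from itertools import groupby


def split_runs(text, ascii_limit=127):
    """Split text into (is_unicode, run) pairs.

    Consecutive characters on the same side of the assumed ASCII cutoff
    are grouped together, so only the non-ASCII characters end up in
    runs flagged as Unicode.
    """
    return [
        (is_unicode, "".join(chars))
        for is_unicode, chars in groupby(text, key=lambda ch: ord(ch) > ascii_limit)
    ]


if __name__ == "__main__":
    sample = "Other then that it\u2019s an excellent bit of software."
    for is_unicode, run in split_runs(sample):
        print(is_unicode, repr(run))
```

For the sample sentence this yields three runs: the text before the apostrophe, the apostrophe on its own (the only run flagged as Unicode), and the text after it.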

Hi Chris.

We don’t output 'TimesNewRomanPSMT' to the XML if it is not mentioned anywhere.

If you have found a performance degradation, we should check it on the real document. Please attach it to the thread so that we can reproduce these two problems. They may be quite specific.

Thank you,