Aspose.Words creates much bigger PDFs than Aspose.Word

rex · March 15, 2006, 9:31am

Hello,

I previously posted a related message (https://forum.aspose.com/c/words/8) to the PDF forum that I probably should have posted here.
I am creating a simple one-page invoice (using mail merge) that is 75K by the time it is converted to a PDF (using Aspose.Words and Aspose.PDF). As a test, I had my code simultaneously create .pdf.xml output files from the same mail merge using Aspose.Word and Aspose.Words. The Aspose.Word .xml file became a 9K PDF, while the Aspose.Words .xml file became a 76K PDF. Should I just go back to using Aspose.Word instead of Aspose.Words, or are there special parameters in .Words for defining how TrueType fonts will be handled when creating a .pdf.xml file?

Here are the versions I am using:
Aspose.Word (version 2.2.3.0)
Aspose.Words (version 3.5.1.0)
Aspose.PDF (version 2.9.2.0)

In the attached .zip file I have included the mail merge template, the .pdf.xml files created by Words and Word, and the resulting PDF files that were rendered by Aspose.PDF. Thank you for your help.

-rex

DmitryV · March 15, 2006, 1:02pm

Hi,
Thank you for reporting. We are currently discussing the problem with the Aspose.Pdf team. I don’t think reverting to the old version is a good idea. Please give us some time to research.

DmitryV · March 16, 2006, 12:51pm

Sorry, the Aspose.Pdf team states that embedding fonts is required to display Unicode characters properly. I would not recommend going back to the old version of Aspose.Word, because it handles some things worse than the newer versions do, including the export of fonts. Is that file size difference critical for you?

rex · March 16, 2006, 2:43pm

We will be saving all invoices, so size is critical. Twice as big would probably be OK, but seven times as big is not going over too well.

miklovan · March 19, 2006, 5:28am

I have written a procedure that converts a document to a PDF file without embedding TrueType fonts. Please note that using this procedure on files containing Unicode text will produce corrupted output, so use it at your own risk.

// Requires: using System.IO; using System.Xml; using System.Windows.Forms; using Aspose.Words;
string filename = Application.StartupPath + @"\test1.doc";
Document doc = new Document(filename);
Doc2PdfWithoutUnicode(doc, Application.StartupPath + @"\result.pdf");

private void Doc2PdfWithoutUnicode(Document doc, string pdfname)
{
    // Export the document to Aspose.Pdf XML in memory and load it for editing.
    XmlDocument xmlDoc = new XmlDocument();
    using (MemoryStream stream = new MemoryStream())
    {
        doc.Save(stream, SaveFormat.FormatAsposePdf);
        stream.Seek(0, SeekOrigin.Begin);
        xmlDoc.Load(stream);
    }

    // The following switches off Unicode support in the PDF, so fonts are not embedded.
    // Use it at your own risk: Unicode text will be corrupted.
    XmlNamespaceManager xmlnsManager = new XmlNamespaceManager(xmlDoc.NameTable);
    xmlnsManager.AddNamespace("pdfns", "Aspose.Pdf");
    XmlNodeList nodeList = xmlDoc.SelectNodes(@"//pdfns:Segment[@IsUnicode='true']", xmlnsManager);
    foreach (XmlNode xnode in nodeList)
    {
        xnode.Attributes["IsUnicode"].Value = "false";
    }

    // Render the modified XML to the final PDF file.
    Aspose.Pdf.Pdf pdf = new Aspose.Pdf.Pdf();
    pdf.IsImagesInXmlDeleteNeeded = true;
    pdf.BindXML(xmlDoc, null);
    pdf.IsTruetypeFontMapCached = true;
    pdf.TruetypeFontMapPath = Path.GetTempPath();
    pdf.Save(pdfname);
}
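
If some documents might contain non-ASCII text, one possible safeguard is to switch off IsUnicode only on segments whose text is plain ASCII. The sketch below is an illustration, not part of the procedure above: it uses only System.Xml, the class and method names (UnicodeSegmentDemo, DisableUnicodeWhereSafe) are made up, the sample XML is a simplified stand-in for real Aspose.Pdf XML, and it assumes the segment text is stored as the element's content. Whether leaving some segments as Unicode still causes fonts to be embedded has not been verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class UnicodeSegmentDemo
{
    // Returns true when every character in the text is 7-bit ASCII.
    static bool IsAscii(string text)
    {
        foreach (char c in text)
        {
            if (c > 127) return false;
        }
        return true;
    }

    // Sets IsUnicode="false" only on segments whose text is pure ASCII.
    // Returns the number of segments that were changed.
    public static int DisableUnicodeWhereSafe(XmlDocument xmlDoc)
    {
        XmlNamespaceManager ns = new XmlNamespaceManager(xmlDoc.NameTable);
        ns.AddNamespace("pdfns", "Aspose.Pdf");

        // Copy the matches first so the attributes are not edited
        // while the XPath result is still being enumerated.
        List<XmlNode> segments = new List<XmlNode>();
        foreach (XmlNode node in xmlDoc.SelectNodes("//pdfns:Segment[@IsUnicode='true']", ns))
        {
            segments.Add(node);
        }

        int changed = 0;
        foreach (XmlNode node in segments)
        {
            if (IsAscii(node.InnerText))
            {
                node.Attributes["IsUnicode"].Value = "false";
                changed++;
            }
        }
        return changed;
    }

    static void Main()
    {
        // Hypothetical sample; real Aspose.Pdf XML has more structure.
        string xml =
            "<Pdf xmlns=\"Aspose.Pdf\"><Section><Text>" +
            "<Segment IsUnicode=\"true\">Invoice 1001</Segment>" +
            "<Segment IsUnicode=\"true\">Caf\u00e9</Segment>" +
            "</Text></Section></Pdf>";

        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(xml);
        Console.WriteLine("Segments changed: " + DisableUnicodeWhereSafe(xmlDoc));
    }
}
```

In Doc2PdfWithoutUnicode, a call to a method like this could stand in for the foreach loop that sets every segment to IsUnicode="false".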

rex · March 20, 2006, 8:40am

That does the trick. Thanks for your help!