Word to PDF Formatting Issues

alocurto2 · June 22, 2009, 5:34pm

We are experiencing issues converting docs to PDFs. For the most part the conversions work, but we have had issues with:

Headers and Footers
Watermarks
Line Breaks.

I have attached some samples of the docs and the converted PDFs. Here is the code being used:

Document doc = new Document(sourceFileFullPath);
// Declare Memory Streams
using(System.IO.MemoryStream xmlStream = new MemoryStream())
{
    using(System.IO.MemoryStream pdfStream = new MemoryStream())
    {
        // Save the document in Aspose.Pdf.Xml format.
        doc.SaveOptions.PdfExportImagesFolder = Path.GetTempPath();
        doc.Save(xmlStream, SaveFormat.AsposePdf);
        // Read the document in Aspose.Pdf.Xml format into Aspose.Pdf.
        Aspose.Pdf.Pdf pdf = new Aspose.Pdf.Pdf();
        pdf.BindXML(xmlStream, null);
        // Instruct to delete temporary image files.
        pdf.IsImagesInXmlDeleteNeeded = true;
        // Produce the PDF file.
        pdf.Save(destinationFileFullPath);
    }
}

alexey.noskov · June 23, 2009, 4:21am

Hi

Thanks for your request. Currently Aspose.Words supports two ways of PDF conversion: direct conversion (without using Aspose.Pdf) and legacy conversion (Aspose.Words+Aspose.Pdf). See the following link for more information:
https://docs.aspose.com/words/net/convert-a-document-to-pdf/
So, you can try using the new method to convert your document to PDF. Here is the code:

Document doc = new Document("in.doc");
doc.SaveToPdf("out.pdf");

In this case the output PDF document looks much better.
Best regards.

alocurto2 · June 23, 2009, 8:18am

I have changed the code to simply create a document and use the SaveToPdf method, but I still have issues. See the attached example.

alexey.noskov · June 23, 2009, 9:52am

Hi,

Thank you for the additional information. Line breaks are lost because there are SmartTags in the paragraphs. I linked this thread to the appropriate issue.
As a workaround, you can try removing SmartTags from the document before converting to PDF. Here is the code:

Document doc = new Document(@"Doc.doc");
RemoveSmartTags(doc);
doc.SaveToPdf(@"out.pdf");

/// <summary>
/// Removes all SmartTag nodes from the document, preserving their content.
/// </summary>
/// <param name="doc">Input document</param>
private void RemoveSmartTags(Document doc)
{
    // Get the collection of SmartTags from the document.
    NodeCollection nodes = doc.GetChildNodes(NodeType.SmartTag, true, true);
    // Loop while there are SmartTags left in the document.
    while (nodes.Count > 0)
    {
        SmartTag tag = (SmartTag)nodes[0];
        // Get the parent node of the SmartTag.
        // All content must be moved from the SmartTag to its parent to preserve the document's content.
        CompositeNode parent = tag.ParentNode;
        // Loop through all nodes inside the SmartTag and move them to the parent node.
        while (tag.HasChildNodes)
            parent.InsertBefore(tag.FirstChild, tag);
        // Remove the now-empty SmartTag.
        tag.Remove();
    }
}

Hope this helps.
Best regards.

alocurto2 · June 23, 2009, 10:35am

I am still getting the same outcome. Did you get a different outcome with that code?

alexey.noskov · June 23, 2009, 10:48am

Hi

Thanks for your inquiry. This code resolves the problem with the address box.
There is also a problem with the text frame in the header. Currently, Aspose.Words does not fully support positioning of frames during rendering and converting to PDF.
As a workaround, you can try using textboxes instead of frames.
Best regards.

caci · June 23, 2009, 3:50pm

doc.SaveToPdf(sPath)

It breaks the formatting and does not render Roman page numbering correctly. XML binding gives correct output, but it is expensive in terms of memory.
Is there any improvement we can make to the PDF XML binding?

alexey.noskov · June 24, 2009, 2:50am

Hi

Thanks for your inquiry. The only thing you can do is refactor your document.
If you would like to improve performance, you can try setting the IsTruetypeFontMapCached option of Aspose.Pdf.
http://www.aspose.com/documentation/file-format-components/aspose.pdf-for-.net-and-java/aspose.pdf.pdf.istruetypefontmapcached.html
Best regards.

mikevan88 · July 10, 2009, 7:08am

Thanks,
But I have millions of documents, so I cannot change the existing documents.
Is it possible to replace the frames with textboxes dynamically by working with the document nodes?
This is really important; we are very interested in your product.
Best regards.

alexey.noskov · July 10, 2009, 7:30am

Hi

Thanks for your request. Unfortunately, there is no way to replace frames with text boxes programmatically. You can do this only manually in MS Word.
Best regards.

aspose.notifier · January 19, 2010, 10:15am

The issues you have found earlier (filed as 8182) have been fixed in this update.

This message was posted using Notification2Forum from Downloads module by aspose.notifier.

aspose.notifier · May 14, 2010, 8:37am

The issues you have found earlier (filed as 7665) have been fixed in this update.

aspose.notifier · July 1, 2011, 2:31am

The issues you have found earlier (filed as WORDSNET-2219) have been fixed in this .NET update and in this Java update.

Febinbabu · January 16, 2024, 9:43am

Hi Team,

I am upgrading Aspose.PDF from version 7.3 to the latest version.

After the upgrade, pdf.IsImagesInXmlDeleteNeeded = true; is not found in version 24.1.

My code is below. How should I convert it to work with version 24.1?

Stream pdfOutPutStream = null;
MemoryStream xmlDoc = null;

try
{
    LoadLicense();

    string PdfImagesFolder;

    pdfOutPutStream = CreateOutPDFStream(fileName);
    Aspose.Words.Document srcDoc = new Aspose.Words.Document(fileName);

    //srcDoc.SaveOptions.ExportImagesFolder = Path.GetDirectoryName(fileName);

    xmlDoc = new MemoryStream();
    srcDoc.Save(xmlDoc, Aspose.Words.SaveFormat.Pdf);
    xmlDoc.Position = 0;

    Aspose.Pdf.Document pdf = new Aspose.Pdf.Document();
    pdf.BindXml(xmlDoc, null);
    pdf.IsImagesInXmlDeleteNeeded = true;

    pdf.Save(pdfOutPutStream);
}

alexey.noskov · January 16, 2024, 9:50am

@Febinbabu As far as I can see, your goal is a simple Word to PDF conversion. There is no need to use Aspose.PDF for this. You can achieve this using the following simple code:

Aspose.Words.Document srcDoc = new Aspose.Words.Document(fileName);
srcDoc.Save(pdfOutPutStream, Aspose.Words.SaveFormat.Pdf);

Please see our documentation for more information:
https://docs.aspose.com/words/net/convert-a-document-to-pdf/

alexey.noskov · January 16, 2024, 12:38pm

A post was split to a new topic: Covert images into pdf

vyacheslav.deryushev · June 24, 2024, 1:56pm

A post was split to a new topic: Issue while spltting the document using PdfFileEditor.Extract