Word to PDF Formatting Issues

We are experiencing issues converting Word documents to PDF. For the most part the conversion works, but we have had issues with:

  1. Headers and Footers
  2. Watermarks
  3. Line Breaks

I have attached some samples of the docs and the converted PDFs. Here is the code being used:

Document doc = new Document(sourceFileFullPath);
// Declare Memory Streams
using(System.IO.MemoryStream xmlStream = new MemoryStream())
{
    using(System.IO.MemoryStream pdfStream = new MemoryStream())
    {
        // Save the document in Aspose.Pdf.Xml format.
        doc.SaveOptions.PdfExportImagesFolder = Path.GetTempPath();
        doc.Save(xmlStream, SaveFormat.AsposePdf);
        // Read the document in Aspose.Pdf.Xml format into Aspose.Pdf.
        Aspose.Pdf.Pdf pdf = new Aspose.Pdf.Pdf();
        pdf.BindXML(xmlStream, null);
        // Instruct to delete temporary image files.
        pdf.IsImagesInXmlDeleteNeeded = true;
        // Produce the PDF file.
        pdf.Save(destinationFileFullPath);
    }
}

Hi

Thanks for your request. Currently Aspose.Words supports two ways of PDF conversion: direct conversion (without using Aspose.Pdf) and legacy conversion (Aspose.Words+Aspose.Pdf). See the following link for more information:
https://docs.aspose.com/words/net/convert-a-document-to-pdf/
So, you can try using the new method to convert your document to PDF. Here is the code:

Document doc = new Document("in.doc");
doc.SaveToPdf("out.pdf");

In this case, the output PDF document looks much better.
Best regards.

I have changed the code to simply create a PDF document using the SaveToPdf method, but I still have issues. See the attached example.

Hi,

Thank you for the additional information. The line breaks are lost because there are SmartTags in the paragraphs. I have linked this thread to the appropriate issue.
As a workaround, you can try removing the SmartTags from the document before converting to PDF. Here is the code:

Document doc = new Document(@"Doc.doc");
RemoveSmartTags(doc);
doc.SaveToPdf(@"out.pdf");

/// <summary>
/// Removes all SmartTag nodes from the document, preserving their content.
/// </summary>
/// <param name="doc">Input document</param>
private void RemoveSmartTags(Document doc)
{
    // Get the collection of SmartTags from the document
    NodeCollection nodes = doc.GetChildNodes(NodeType.SmartTag, true, true);
    // Loop while there are SmartTags in the document
    while (nodes.Count > 0)
    {
        SmartTag tag = (SmartTag) nodes[0];
        // Get the parent node of the SmartTag.
        // We move all content from the SmartTag to its parent to preserve the document's content.
        CompositeNode parent = tag.ParentNode;
        // Loop through all nodes inside the SmartTag and move them to the parent node
        while (tag.HasChildNodes)
            parent.InsertBefore(tag.FirstChild, tag);
        // Remove the now-empty SmartTag
        tag.Remove();
    }
}

Hope this helps.
Best regards.
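
For readers on a more recent Aspose.Words for .NET release, the manual loop above may not be necessary. To my knowledge, later versions expose a built-in RemoveSmartTags() method on CompositeNode (and therefore on Document) that removes SmartTag nodes while keeping their content, and SaveToPdf has been replaced by Save. A minimal sketch under those assumptions; the class name, method name, and paths are hypothetical:

```csharp
using Aspose.Words;

public static class SmartTagCleanup
{
    // Hypothetical helper: strips smart tags, then converts the document to PDF.
    public static void ConvertWithoutSmartTags(string sourcePath, string pdfPath)
    {
        Document doc = new Document(sourcePath);

        // Assumption: only available in more recent Aspose.Words versions.
        // Removes the SmartTag nodes but keeps the content they wrapped.
        doc.RemoveSmartTags();

        // Direct conversion to PDF, without Aspose.Pdf.
        doc.Save(pdfPath, SaveFormat.Pdf);
    }
}
```

If the installed version does not have RemoveSmartTags(), the hand-written loop from the reply above remains the fallback.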

I am still getting the same outcome. Did you get a different outcome with that code?

Hi

Thanks for your inquiry. This code resolves the problem with the address box.
There is also a problem with the text frame in the header. Currently, Aspose.Words does not fully support positioning of frames during rendering and converting to PDF.
As a workaround, you can try using textboxes instead of frames.
Best regards.

doc.SaveToPdf(sPath)

It destroys the formatting and does not render Roman page numbering correctly. Binding XML gives the correct output, but it is expensive memory-wise.
Is there any improvement we can make to the PDF XML binding?

Hi

Thanks for your inquiry. The only thing you can do is restructure your document.
If you would like to improve performance, you can try setting the IsTruetypeFontMapCached option of Aspose.Pdf.
http://www.aspose.com/documentation/file-format-components/aspose.pdf-for-.net-and-java/aspose.pdf.pdf.istruetypefontmapcached.html
Best regards.

Thanks,
But I have millions of documents, so I cannot change the existing documents.
Is it possible to replace the frames with textboxes dynamically, through the document's object nodes?
It’s really important, we are really interested in your product.
Best regards.

Hi

Thanks for your request. Unfortunately, there is no way to replace frames with text boxes programmatically. You can do this only manually in MS Word.
Best regards.

The issues you have found earlier (filed as 8182) have been fixed in this update.

This message was posted using Notification2Forum from Downloads module by aspose.notifier.

The issues you have found earlier (filed as 7665) have been fixed in this update.

The issues you have found earlier (filed as WORDSNET-2219) have been fixed in this .NET update and in this Java update.

Hi Team,

I am upgrading Aspose.PDF version 7.3 to the latest version.

After the upgrade, pdf.IsImagesInXmlDeleteNeeded is no longer available in version 24.1.

My code is provided below. How should I convert it to work with version 24.1?

Stream pdfOutPutStream = null;
MemoryStream xmlDoc = null;

try
{
    LoadLicense();

    string PdfImagesFolder;

    pdfOutPutStream = CreateOutPDFStream(fileName);
    Aspose.Words.Document srcDoc = new Aspose.Words.Document(fileName);

    //srcDoc.SaveOptions.ExportImagesFolder = Path.GetDirectoryName(fileName);

    xmlDoc = new MemoryStream();
    srcDoc.Save(xmlDoc, Aspose.Words.SaveFormat.Pdf);
    xmlDoc.Position = 0;

    Aspose.Pdf.Document pdf = new Aspose.Pdf.Document();
    pdf.BindXml(xmlDoc, null);
    pdf.IsImagesInXmlDeleteNeeded = true;

    pdf.Save(pdfOutPutStream);
}

@Febinbabu As far as I can see, your goal is a simple Word to PDF conversion. There is no need to use Aspose.PDF for this. You can achieve it using the following simple code:

Aspose.Words.Document srcDoc = new Aspose.Words.Document(fileName);
srcDoc.Save(pdfOutPutStream, Aspose.Words.SaveFormat.Pdf);

Please see our documentation for more information:
https://docs.aspose.com/words/net/convert-a-document-to-pdf/
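
Putting that reply into the shape of the original method, a minimal sketch might look like the following. The class and method names are hypothetical; only Document and Save(Stream, SaveFormat.Pdf) come from the reply above. As far as I understand, the direct conversion does not go through an intermediate XML stream or temporary image files, so there is nothing left for IsImagesInXmlDeleteNeeded to clean up and the property needs no replacement here.

```csharp
using System.IO;
using Aspose.Words;

public static class WordToPdfConverter
{
    // Hypothetical helper: loads the Word file at sourcePath and writes
    // the rendered PDF into pdfOutput. The caller owns the stream.
    public static void Convert(string sourcePath, Stream pdfOutput)
    {
        Document srcDoc = new Document(sourcePath);

        // Direct Word-to-PDF conversion; no Aspose.PDF and no BindXml step.
        srcDoc.Save(pdfOutput, SaveFormat.Pdf);
    }
}
```

A caller would typically create the output stream in a using block so that it is flushed and closed even if the conversion throws.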
