Differences in PDF Output


#1

Hi,

We are using Aspose.Words and Aspose.PDF to generate Word documents and save them as PDF documents.

Our process involves creating an Aspose.Words document in memory and converting it to PDF. We then save the same document to disk.

We also have a requirement to convert the saved document to a PDF at a future time.

The problem is that the resulting PDFs are different (from the 'same' Word document) depending on whether we generate them from the in-memory document or from the saved one.

The XML produced by Aspose.Words.Document.Save with FormatAsposePDF differs between the two scenarios: the XML produced from the disk version of the document contains extra tags. I have attached the XML.

Any help would be greatly appreciated.

Regards,

Gary Woods.


#2

Hi Gary,

Thanks for reporting this problem to us. It would be very helpful if you could also send us the document in DOC format from which the file 'w363-fromdisk.xml' was generated.

Meanwhile, I have logged this problem in our defect database as Issue #956.

Best Regards,


#3

It seems that the difference is caused by empty cells.

While the document is being built in memory in Aspose.Words, a Cell object without child nodes is not a problem; the model is valid.

However, when the document is saved to a DOC file, every empty cell is "validated" by adding an empty paragraph, because a Word document cannot contain a completely empty cell.

This causes the difference. When you write to PDF from memory, the empty Cell objects have no child nodes.

Later, when you save to DOC and reload the DOC, every empty cell gets one Paragraph object inside it. When you save such a document to PDF, the output differs from the original.

As a workaround, make sure the document has no Cell nodes without children before you convert it to PDF. The following code will help you do that:

// Give every cell at least one (possibly empty) paragraph.
foreach (Cell cell in doc.GetChildNodes(NodeType.Cell, true))
{
    cell.EnsureMinimum();
}
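
As a rough sketch of where this fits in the workflow described in the first post, the loop can be wrapped in a small helper and called once on the in-memory document before either save. The class and method names here (CellNormalizer, EnsureCellsHaveParagraphs) are hypothetical, and the Aspose.Words.Tables namespace is an assumption that holds for recent Aspose.Words releases; older releases may expose Cell directly under Aspose.Words.

```csharp
using Aspose.Words;
using Aspose.Words.Tables; // assumption: Cell lives here in recent releases

// Hypothetical helper, not part of Aspose.Words.
static class CellNormalizer
{
    // Adds an empty paragraph to every cell that has none, so the
    // in-memory model matches what a DOC save and reload would produce.
    public static void EnsureCellsHaveParagraphs(Document doc)
    {
        foreach (Cell cell in doc.GetChildNodes(NodeType.Cell, true))
        {
            cell.EnsureMinimum();
        }
    }
}
```

Calling CellNormalizer.EnsureCellsHaveParagraphs(doc) right after the document is built, and before the first conversion to PDF, should mean the in-memory conversion and the later conversion from the saved DOC both start from cells that already contain a paragraph.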


#4

Vladimir,

Thanks for that. The workaround did the job.

Kind regards,

Gary.