Word Document Template - Data Merge Rendering Issue

We are facing a critical issue after upgrading Aspose.Words from version 19.10 to the latest version.

After we merge the data into the Word document template and save it as PDF, the data-bound fields show the data without HTML rendering. Previously this worked fine. Plain data without HTML tags still renders correctly.

Please see the attached program file, which you can run, along with the attached sample template and merge data file. Compare the final output documents (before and after the upgrade).

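using System.IO;
using System.IO.Packaging;
using System.Xml;
using Aspose.Words;
using Aspose.Words.Saving;
using DocumentFormat.OpenXml.Packaging;

using (var docStream = new MemoryStream())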
{
    // load word document template
    byte[] template = File.ReadAllBytes("MergeDataTemplate.docx");
    docStream.Write(template, 0, template.Length);

    using (var pkg = Package.Open(docStream, FileMode.Open, FileAccess.ReadWrite))
    {
        using (var wordDoc = WordprocessingDocument.Open(pkg))
        {
            // load merge data for word document
            string mergeDataXml = File.ReadAllText("MergeData.xml");
            var mergeDataDoc = new XmlDocument();
            mergeDataDoc.LoadXml(mergeDataXml);

            // set merge data to word document template
...

            // merge data in word document template
...
            // save the word document template
            wordDoc.MainDocumentPart.Document.Save();
            wordDoc.Close();
        }
    }

    docStream.Seek(0, SeekOrigin.Begin);

    // save the word template to pdf
    using (var pdfStream = new MemoryStream())
    {
        var saveOptions = (PdfSaveOptions)SaveOptions.CreateSaveOptions(SaveFormat.Pdf);
        saveOptions.FontEmbeddingMode = PdfFontEmbeddingMode.EmbedNonstandard;
        saveOptions.ImageCompression = PdfImageCompression.Jpeg;
        saveOptions.JpegQuality = 90;

        var document = new Document(docStream);
        document.Save(pdfStream, saveOptions);

        pdfStream.Seek(0, SeekOrigin.Begin);

        // File.Create truncates an existing FinalDocument.pdf, so no stale bytes remain
        using (var fileStream = File.Create("FinalDocument.pdf"))
        {
            pdfStream.CopyTo(fileStream);
        }
    }
}

See attached file for the whole program.

FinalDocument_ACTUAL.pdf (39.4 KB)
FinalDocument_EXPECTED.pdf (67.8 KB)
MergeDataTemplate.docx (24.2 KB)
MergeData.xml.zip (1.1 KB)
Program.cs.zip (1.5 KB)

@trizetto,

We tested the scenario and have managed to reproduce the same problem on our end. For the sake of any corrections in Aspose.Words API, we have logged this problem in our issue tracking system. The ID of this issue is WORDSNET-22507. We will further look into the details of this problem and will keep you updated on the status of linked issue. We apologize for your inconvenience.

I checked the program with different versions of Aspose.Words. The issue first appears in version 20.10.

There is no HTML rendering issue in versions 19.10 through 20.9.

@trizetto,

We have completed the analysis of WORDSNET-22507 and the root cause has been determined.

I have also logged these details in our issue tracking system and will keep you posted here on any further updates.

@awais.hafeez,

We are still facing the issue even after upgrading to version 21.8.
This time it partially renders the HTML, but not the same way as version 19.10 did.
Please see the attached documents.

FinalDocument_ACTUAL_21.8_FIX.pdf (61.6 KB)
FinalDocument_EXPECTED.pdf (67.8 KB)
Program.cs.zip (1.5 KB)
MergeDataTemplate.docx (24.2 KB)
MergeData.xml.zip (1.1 KB)

@trizetto,

However, setting the SaveOptions.FlatOpcXmlMappingOnly property to false produces the desired PDF output on my end. Please check; I produced the following two PDF files using the code you provided:

Regarding the partial rendering of content, the 19.10 version of Aspose.Words for .NET also produces similar output in PDF (see FinalDocument_19.10.pdf (39.4 KB)). Could you please double-check whether you shared the correct resources (MergeData.xml etc.), the ones you are actually seeing this problem with?

The issues you have found earlier (filed as WORDSNET-22507) have been fixed in this Aspose.Words for .NET 21.8 update and this Aspose.Words for Java 21.8 update.

Please check the attached document template with the previous XML data. It reproduces the same partial rendering issue that I reported earlier.

MergeDataTemplate.docx (27.1 KB)

@trizetto,

To address this problem, we have logged a separate issue with ID WORDSNET-22609. We will further look into the details of this problem and will keep you updated here on the status of linked issue. We apologize for your inconvenience.

@awais.hafeez ,

Could you please provide a workaround for the issue?
We used the new version of Aspose.Words, and with it the HTML data-bound field renders fine. However, when we remove the comments, update the page layout, and save as PDF, the HTML data-bound field does not render (see attached files).
MergeDataSample.zip (23.8 KB)

@trizetto We neglected to notify you that the issue filed as WORDSNET-22609 is also resolved. The fix was included in version 21.10.0 of Aspose.Words.
I have checked your documents and code, and as far as I can see, the code that uses DocumentFormat.OpenXml.Packaging.WordprocessingDocument produces a Flat OPC document with unrendered HTML. So the problem is not in Aspose.Words rendering, but in the code you are using to merge the custom XML part.

@alexey.noskov,
I understand what you are saying, but when we call the UpdatePageLayout() method of the Aspose.Words.Document class, it causes the rendering issue.
If you run the same code with Aspose.Words version 19.10, the document renders fine.

Please find a program attached that demonstrates the issue we are still seeing. Program.cs.zip (1.9 KB)

@TZTCS @trizetto You are right, calling UpdatePageLayout causes the problem. The reason is that UpdatePageLayout builds the document layout with the default fixed-page-format save options, in which the FlatOpcXmlMappingOnly property is enabled (the default value). However, you need to set both LoadOptions.FlatOpcXmlMappingOnly and SaveOptions.FlatOpcXmlMappingOnly to false to allow documents of any format to be mapped.
Because the document layout is cached after UpdatePageLayout runs, the layout is not rebuilt with the new SaveOptions.FlatOpcXmlMappingOnly value, so you see the result produced with this option enabled.
You need to remove the UpdatePageLayout call to get the desired result.
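
A minimal sketch of the load/save configuration described above, assuming an Aspose.Words for .NET release that exposes both FlatOpcXmlMappingOnly properties named in this thread (availability may differ in other releases). The class name, method name, and parameters are hypothetical and chosen only for illustration; mergedDocx stands for the stream that already contains the merged template:

```csharp
using System.IO;
using Aspose.Words;
using Aspose.Words.Loading;
using Aspose.Words.Saving;

public static class MergedTemplatePdfExport
{
    // Hypothetical helper: converts an already merged DOCX stream to PDF.
    public static void SaveAsPdf(Stream mergedDocx, string pdfPath)
    {
        // Allow XML mapping for document formats other than Flat OPC at load time.
        var loadOptions = new LoadOptions { FlatOpcXmlMappingOnly = false };
        var document = new Document(mergedDocx, loadOptions);

        // Deliberately no document.UpdatePageLayout() call here: per the
        // explanation above, it would cache a layout built with the default
        // options, and the save options below would then have no effect.

        // Apply the same setting when the layout is built during saving.
        var saveOptions = new PdfSaveOptions { FlatOpcXmlMappingOnly = false };
        document.Save(pdfPath, saveOptions);
    }
}
```

In the original program this would replace the `new Document(docStream)` and `document.Save(...)` steps, with docStream rewound to position 0 before the call.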

@alexey.noskov,
If we stop calling Document.UpdatePageLayout(), will the process still work properly?
Please see the attached program, where we remove the comments from the document template and do some other processing. Then we call the UpdatePageLayout() method, as recommended in your API documentation.
Program.cs.zip (1.9 KB)

@trizetto UpdatePageLayout is not required in your code since this action is executed upon saving to PDF.