Aspose.Words - MailMerge HTML tables with CSS introducing blank lines

Hello Support,
We are facing an issue with the Aspose.Words library: when HTML content replaces a mail merge key/value pair, additional blank lines are introduced between tables. Attached are a sample solution and screenshots from MS Word showing the blank lines.
We are looking for your help to control how blank lines are included when HTML content is placed as mail merge field data.
Aspose_19_09_pml_sample.zip (5.1 MB)
table_issue_MSWord_2.png (35.1 KB)
table_issue_MSWord_1.png (28.4 KB)

Thank you,
Parthiban

@parthiban.natarajan,

You will notice the same behavior when you “Save As” this PML.html file to DOCX format in MS Word 2019 (msw-2019.docx (16.0 KB)).

However, you can work around this problem by using the following code:

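using Aspose.Words;
using Aspose.Words.Tables;

Document doc = new Document(@"C:\Temp\Aspose_19_09_pml_sample\App_Data\PML.html");

// ToArray() takes a snapshot of the live collection, so removing a table does not disturb the loop.
foreach (Table table in doc.GetChildNodes(NodeType.Table, true).ToArray())
{
    // Remove single-row tables that contain no text.
    if (string.IsNullOrEmpty(table.ToString(SaveFormat.Text).Trim()) && table.Rows.Count == 1)
        table.Remove();
}

foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    // Shrink empty paragraphs whose paragraph break is hidden to 1 pt so they take almost no height.
    if (string.IsNullOrEmpty(para.ToString(SaveFormat.Text).Trim()) && para.ParagraphBreakFont.Hidden)
        para.ParagraphBreakFont.Size = 1;
}

doc.Save(@"C:\Temp\Aspose_19_09_pml_sample\App_Data\21.9.docx");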

Hi Hafeez,
Thank you for the code samples. I tried loading the DOCX and cleaning up the hidden paragraphs/font text, but that does not actually clear the blank lines. Your sample loads the HTML content directly into a Word document object, and that does remove the blank lines.

Do you know why the behavior differs? The cleanup cannot clear the hidden text when the document is loaded from a DOCX file stream, but it can when the document is loaded from HTML.

I would also like to know whether there is any option to use a document stream as the mail merge field data.

Thanks,
Parthiban

@parthiban.natarajan,

I think you can build your logic on the following code to get the desired output:

Document doc = new Document(@"C:\Temp\Aspose_19_09_pml_sample\Member- English- Sample.docx");
doc.MailMerge.FieldMergingCallback = new HandleMergeFieldInsertHtml();
string strDMRdrugswithexplanations = getPMLHtml();
// Point the relative stylesheet link at an absolute path so the CSS can be resolved.
// Contains() also matches at index 0, which the original "IndexOf(...) > 0" check missed.
if (strDMRdrugswithexplanations.Contains("href='PML.css'"))
    strDMRdrugswithexplanations = strDMRdrugswithexplanations.Replace("href='PML.css'", "href='" + "C:\\Temp\\Aspose_19_09_pml_sample\\PML.css'");

doc.MailMerge.Execute(new string[] { "PML" }, new string[] { strDMRdrugswithexplanations });

doc.Save(@"C:\Temp\Aspose_19_09_pml_sample\21.10 custom code.docx");

internal class HandleMergeFieldInsertHtml : IFieldMergingCallback
{
    public void ImageFieldMerging(ImageFieldMergingArgs args)
    {
        // Do nothing. Throwing here would abort the merge if the template contains an image merge field.
    }

    /// <summary>
    /// This is called when merge field is actually merged with data in the document.
    /// </summary>
    void IFieldMergingCallback.FieldMerging(FieldMergingArgs e)
    {
        // Only the "PML" merge field carries HTML data; the name is compared case-insensitively.
        if (string.Compare(e.FieldName, "PML", true) == 0 && e.FieldValue != null)
        {
            // Insert the text for this merge field as HTML data, using DocumentBuilder.
            DocumentBuilder builder = new DocumentBuilder(e.Document);
            builder.MoveToMergeField(e.DocumentFieldName);

            HandleNodeChanging handler = new HandleNodeChanging();
            e.Document.NodeChangingCallback = handler;

            builder.InsertHtml(e.FieldValue.ToString(), true);
            e.Document.NodeChangingCallback = null;

            // Process Table(s) reference that we just inserted via InsertHtml
            foreach (Table table in handler.InsertedTables)
                if (string.IsNullOrEmpty(table.ToString(SaveFormat.Text).Trim()) && table.Rows.Count == 1)
                    table.Remove();

            // Process Paragraph(s) reference that we just inserted via InsertHtml
            foreach (Paragraph para in handler.InsertedParagraphs)
                if (string.IsNullOrEmpty(para.ToString(SaveFormat.Text).Trim()) && para.ParagraphBreakFont.Hidden)
                    para.ParagraphBreakFont.Size = 1;

            e.Text = "";
        }
    }
}

private static string getPMLHtml()
{
    return File.ReadAllText(@"C:\Temp\Aspose_19_09_pml_sample\App_Data\PML.html");
}


public class HandleNodeChanging : INodeChangingCallback
{
    void INodeChangingCallback.NodeInserted(NodeChangingArgs args)
    {
        if (args.Node.NodeType == NodeType.Table)
            mInsertedTables.Add(args.Node);
                
        if (args.Node.NodeType == NodeType.Paragraph)
            mInsertedParagraphs.Add(args.Node);
    }

    void INodeChangingCallback.NodeInserting(NodeChangingArgs args)
    {
        // Do Nothing
    }

    void INodeChangingCallback.NodeRemoved(NodeChangingArgs args)
    {
        // Do Nothing
    }

    void INodeChangingCallback.NodeRemoving(NodeChangingArgs args)
    {
        // Do Nothing
    }

    public List<Node> InsertedTables
    {
        get { return mInsertedTables; }
    }
    public List<Node> InsertedParagraphs
    {
        get { return mInsertedParagraphs; }
    }

    private readonly List<Node> mInsertedTables = new List<Node>();
    private readonly List<Node> mInsertedParagraphs = new List<Node>();
}

Hi Hafeez,
Thank you for the code samples. I used them and still see the same issue: the DOCX output differs from the HTML saved directly as a DOCX file. I am attaching both documents so you can see the difference, along with a revised sample solution for your review.
Thanks,
Parthiban
Member- English- Sample_PML_Updated.docx (20.5 KB)
Member- English- Sample_PML_Updated_HiddenRemoved.docx (7.3 KB)
Aspose_19_09_pml_sample2.zip (5.1 MB)

@parthiban.natarajan,

Please also provide a DOCX file showing your expected output for our reference. You can create this file manually in MS Word. Please also list the complete steps that you performed in MS Word to create it. We will then provide you with code to achieve the same output by using Aspose.Words.

Hi Hafeez,
Please find attached the MS Word document with the expected output and a screenshot showing the particular area where we are facing the issue with blank lines.

Step 1: The MS Word document (Member- English- Sample.docx) was updated with mail merge fields.
Step 2: Using the Aspose library, we merged the HTML (PML_NoSep.html) into the mail merge field named “PML”. This file is in the app_data folder.
Step 3: After the mail merge, the MS Word document was saved as “Member- English- Sample_PML_Updated.docx”.
Step 4: The document from Step 3 has blank lines between the empty tables (used as 0.2 inch height spacers). I manually removed the empty lines and attached the expected document here for your reference: “Member- English- Sample_PML_ExpectedOutput.docx”.

We are looking for your help in achieving the desired outcome as part of the mail merge itself.
Thanks,
Parthiban

Member- English- Sample.docx (25.7 KB)
Member- English- Sample_PML_Updated.docx (20.5 KB)
Expected_outcome_SS.png (98.2 KB)
Member- English- Sample_PML_ExpectedOutput.docx (26 KB)

@parthiban.natarajan,

You can simply keep the tables and remove empty paragraphs by using the following code:

internal class HandleMergeFieldInsertHtml : IFieldMergingCallback
{
    public void ImageFieldMerging(ImageFieldMergingArgs args)
    {
        // Do nothing. Throwing here would abort the merge if the template contains an image merge field.
    }

    /// <summary>
    /// This is called when merge field is actually merged with data in the document.
    /// </summary>
    void IFieldMergingCallback.FieldMerging(FieldMergingArgs e)
    {
        // Only the "PML" merge field carries HTML data; the name is compared case-insensitively.
        if (string.Compare(e.FieldName, "PML", true) == 0 && e.FieldValue != null)
        {
            // Insert the text for this merge field as HTML data, using DocumentBuilder.
            DocumentBuilder builder = new DocumentBuilder(e.Document);
            builder.MoveToMergeField(e.DocumentFieldName);

            HandleNodeChanging handler = new HandleNodeChanging();
            e.Document.NodeChangingCallback = handler;

            builder.InsertHtml(e.FieldValue.ToString(), true);
            e.Document.NodeChangingCallback = null;

            // Process Paragraph(s) reference that we just inserted via InsertHtml
            foreach (Paragraph para in handler.InsertedParagraphs)
                if (string.IsNullOrEmpty(para.ToString(SaveFormat.Text).Trim()) && para.ParagraphBreakFont.Hidden)
                    para.Remove();

            e.Text = "";
        }
    }
}
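
As a possible alternative, and touching on the earlier question about using a document stream for the field data, the HTML could be loaded into its own Document from a stream, cleaned up the same way as the directly loaded PML.html, and then inserted at the merge field. The following is only a minimal sketch and has not been tested against the attached sample. The class name HandleMergeFieldInsertDocument is hypothetical, and the sketch assumes that the HTML is UTF-8 text and that the stylesheet href has already been made absolute, as in the earlier code.

```csharp
using System;
using System.IO;
using System.Text;
using Aspose.Words;
using Aspose.Words.MailMerging;

// Hypothetical callback name; a sketch only, not verified against the attached sample.
internal class HandleMergeFieldInsertDocument : IFieldMergingCallback
{
    public void ImageFieldMerging(ImageFieldMergingArgs args)
    {
        // No image fields are handled in this sketch.
    }

    public void FieldMerging(FieldMergingArgs e)
    {
        if (!string.Equals(e.FieldName, "PML", StringComparison.OrdinalIgnoreCase) || e.FieldValue == null)
            return;

        // Load the HTML into a standalone document from a stream.
        // Assumption: the HTML is UTF-8 and any stylesheet href is already absolute.
        Document htmlDoc;
        using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(e.FieldValue.ToString())))
            htmlDoc = new Document(stream);

        // ToArray() takes a snapshot, so paragraphs can be removed while looping.
        foreach (Paragraph para in htmlDoc.GetChildNodes(NodeType.Paragraph, true).ToArray())
            if (string.IsNullOrEmpty(para.ToString(SaveFormat.Text).Trim()) && para.ParagraphBreakFont.Hidden)
                para.Remove();

        // Insert the cleaned document at the merge field instead of calling InsertHtml.
        DocumentBuilder builder = new DocumentBuilder(e.Document);
        builder.MoveToMergeField(e.DocumentFieldName);
        builder.InsertDocument(htmlDoc, ImportFormatMode.KeepSourceFormatting);

        e.Text = "";
    }
}
```

It would be assigned to doc.MailMerge.FieldMergingCallback in place of HandleMergeFieldInsertHtml. Whether the result matches the expected document would still need to be checked against Member- English- Sample_PML_ExpectedOutput.docx.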