Table of Contents - One item is empty after Mail Merge

We are using a Table of Contents in a .docx document and implementing mail merge. For some reason, item “6” becomes empty, which pushes the table to add one more entry.

  1. 23
  2. Appendix A: Title A…23
  3. Appendix B: Title A. Continued…24
  4. Appendix C: Title C …26
  5. Appendix D: Title D…27

In the original document, this is how it looks:

--Page Break-- --Section Break--
6. Appendix A: Title A
7. Appendix B: Title A. Continued.
8. Appendix C: Title C .

After the mail merge

  1. Appendix A: Title A
  2. Appendix B: Title A. Continued.
  3. Appendix C: Title C .

Is there any way to remove the empty “6” ToC item and recalculate the ToC?

@heslabon

To address the issue of an empty item in the Table of Contents (ToC) after performing a mail merge, you can utilize the cleanup options provided by Aspose.Words. Specifically, you can set the CleanupOptions property to remove empty paragraphs and unused fields during the mail merge process. This will help eliminate any empty entries in the ToC that may have resulted from the merge.

Here’s how you can implement this in your code:

using Aspose.Words;
using Aspose.Words.MailMerging;

// Load your document
Document doc = new Document("path/to/your/document.docx");

// Set cleanup options to remove empty paragraphs and unused fields
doc.MailMerge.CleanupOptions = MailMergeCleanupOptions.RemoveEmptyParagraphs |
                               MailMergeCleanupOptions.RemoveUnusedFields;

// Execute the mail merge with your data
// (dataTable is assumed to be a populated System.Data.DataTable)
doc.MailMerge.Execute(dataTable);

// Save the updated document
doc.Save("path/to/save/updated_document.docx");

In this example, the RemoveEmptyParagraphs option removes any paragraphs left empty by the mail merge, which should help eliminate the empty ToC item. The RemoveUnusedFields option removes any merge fields that have no corresponding data, further cleaning up the document.

After executing the mail merge with these options, you may need to update the Table of Contents to reflect the changes. You can do this in code by calling doc.UpdateFields() before saving, or in Microsoft Word by right-clicking the ToC and selecting “Update Field”.

By following these steps, you should be able to remove the empty ToC item and ensure that your document is formatted correctly after the mail merge.

For more detail on cleanup options during mail merge, see “Clean Up Before or During a Mail Merge Operation” in the Aspose documentation.

@heslabon Could you please attach your input and output documents, along with code that will allow us to reproduce the problem? We will check the issue and provide you with more information.

This is the input document
Test-Aspose.docx (1.5 MB)

This is the output (see ToC item #5)

Test-Aspose.pdf (115.5 KB)

_document.MailMerge.CleanupOptions = MailMergeCleanupOptions.RemoveEmptyParagraphs | MailMergeCleanupOptions.RemoveUnusedRegions | MailMergeCleanupOptions.RemoveUnusedFields;

_document.MailMerge.FieldMergingCallback = new HandleMailMergeField();
_document.MailMerge.ExecuteWithRegions(dataSource);

if (_document.MailMerge.GetFieldNames().Any())
{
    dataSource.Reset();
    _document.MailMerge.Execute(dataSource);
}

@heslabon Thank you for the additional information. Could you please also provide your data source and your implementation of HandleMailMergeField? As I can see from your output document, the problem occurs because a paragraph break is inserted at the beginning of the 5th item, so there is another empty heading paragraph in the output.

void IFieldMergingCallback.FieldMerging(FieldMergingArgs e)
{
    if (string.IsNullOrEmpty(e.DocumentFieldName))
    {
        return;
    }

    if (e.DocumentFieldName.StartsWith(MailMergeManager.SectionBreakFieldPrefix))
    {
        if (_builder == null)
        {
            _builder = new DocumentBuilder(e.Document);
        }

        _builder.MoveToMergeField(e.DocumentFieldName);
        string bookmark = MailMergeManager.BookMarkPrefix + "_" + e.DocumentFieldName.Split('_')[1] + "_" + _i;
        _builder.StartBookmark(bookmark);
        _builder.EndBookmark(bookmark);

        _i++;
    }
}

Data source

 public bool GetValue(string fieldName, out object fieldValue)
 {
     if (string.IsNullOrEmpty(fieldName))
     {
         fieldValue = null;
         return false;
     }

     if (_index <= 0)
     {
         ApplySorts(fieldName);
     }

     if (_index >= _businessObjects.Count || OrderByExpressionParser.IsOrderByExpression(fieldName))
     {
         fieldValue = null;
         return false;
     }

     var businessObject = _businessObjects[_index];

     IEnumerable<object> valuesToJoin;
     try
     {
         if (MailMergeGlobalFields.GetValue(fieldName, out fieldValue))
         {
             return true;
         }

         if (_translator != null)
         {
             fieldName = _translator.TranslateField(fieldName);
         }

         fieldName = NormalizeEntityName(fieldName);
         var result = businessObject.Eval(fieldName);
         if (result.PropertyExpression.EndElement.PropertyInfo.IsSimple)
         {
             var values = result.Values.Select(value => MailMergePropertyFormatter.FormatValue(result.PropertyExpression.EndElement.PropertyInfo, value));
             fieldValue = values.OfType<byte[]>().FirstOrDefault();
             if (fieldValue != null)
             {
                 return true;
             }
             valuesToJoin = values.OrderBy(v => v);
         }
         else
         {
             valuesToJoin = result.Objects.OrderBy(MultipleFieldBusinessObjectComparer.GetDefaultSortPropertyValue);
         }
     }
     catch (ExpressionParseException)
     {
         fieldValue = null;
         return true;
     }

     fieldValue = String.Join(", ", valuesToJoin);
     return true;
 }

Not sure if the data source and HandleMailMergeField are going to help, but is it possible to ignore or delete the paragraph break for the ToC, to avoid this empty heading?

@heslabon I am afraid the provided code still does not allow us to test your scenario on our side.

You can try removing empty heading paragraphs in your document:

Document doc = new Document(@"C:\Temp\in.docx");
// Remove empty paragraphs of the first outline level.
doc.GetChildNodes(NodeType.Paragraph, true).Cast<Paragraph>()
    .Where(p => p.ParagraphFormat.OutlineLevel == OutlineLevel.Level1 && !p.HasChildNodes).ToList()
    .ForEach(p => p.Remove());

doc.UpdateFields();
doc.Save(@"C:\Temp\out.docx");

This does not work either. Do you have any other suggestions?

I don't think I will be able to provide the full code, since we have our own implementation of the data source.

@heslabon I am afraid it is difficult to suggest something without the ability to reproduce the problem on our side. If possible, could you please create a simple console application that will allow us to reproduce the problem? We will investigate the problem and provide you with more information.
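
Regarding the empty-heading cleanup above: one possible reason it finds nothing, offered only as an assumption since the problem could not be reproduced, is that the empty heading paragraph is not strictly childless. The FieldMerging callback inserts a bookmark at each section-break merge field, and the original layout shows a page break before the first appendix, so HasChildNodes may be true even though the paragraph has no visible text. A minimal sketch that tests the paragraph's text instead, and demotes the paragraph out of the ToC rather than deleting it (so any bookmark or page break inside it survives), might look like this. The file paths are placeholders, and the Level1 check is carried over from the earlier snippet:

```csharp
using System.Linq;
using Aspose.Words;

Document doc = new Document(@"C:\Temp\in.docx");

// Collect first-level heading paragraphs that have no visible text.
// They may still contain bookmarks or a page break, so HasChildNodes
// is deliberately not used here.
var emptyHeadings = doc.GetChildNodes(NodeType.Paragraph, true)
    .Cast<Paragraph>()
    .Where(p => p.ParagraphFormat.OutlineLevel == OutlineLevel.Level1
             && string.IsNullOrWhiteSpace(p.ToString(SaveFormat.Text)))
    .ToList();

foreach (Paragraph p in emptyHeadings)
{
    // Demote the paragraph so the ToC no longer picks it up,
    // while keeping whatever it contains.
    p.ParagraphFormat.StyleIdentifier = StyleIdentifier.Normal;
    p.ParagraphFormat.OutlineLevel = OutlineLevel.BodyText;
}

// Rebuild the ToC after the headings have changed.
doc.UpdateFields();
doc.Save(@"C:\Temp\out.docx");
```

This would run after the mail merge and before saving. If the empty paragraph is not needed at all, calling p.Remove() in place of the two assignments would delete it together with its contents.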