How to check the complexity of a mail merge template using Java

We have found many cases where mail merge fails for a template that contains 4 levels of nested tables and many nested conditional statements, even when the data set being merged is small.

We are looking for a way to measure the complexity of a Word template, so that we can detect such templates and prevent our merge server from crashing.

I am not sure whether the Aspose library provides an API for this kind of measurement. If not, is there another way for us to measure a Word template?

We can calculate the depth of nested tables, but how can I calculate the nesting depth of conditional statements?

BTW, we are using 18.4.

Thank you

@zwang

If you want to count the IF fields in mail merge regions, you can do so by iterating over the document nodes.
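
As a rough illustration of measuring IF nesting depth, here is a minimal sketch that uses only the Java standard library. It assumes the field codes are available as text with { } marking field boundaries, the way Word shows them when field codes are toggled on. The class name IfNestingDepth and that text representation are hypothetical and are not part of Aspose.Words. With a DocumentVisitor, the same stack logic could presumably be driven from visitFieldStart and visitFieldEnd instead of from braces.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class IfNestingDepth {

    /** Returns the deepest level of IF fields nested inside other IF fields. */
    static int maxIfDepth(String fieldCodeText) {
        // One entry per open field; true means the open field is an IF field.
        Deque<Boolean> openFields = new ArrayDeque<>();
        int currentIfDepth = 0;
        int maxIfDepth = 0;

        for (int i = 0; i < fieldCodeText.length(); i++) {
            char c = fieldCodeText.charAt(i);
            if (c == '{') {
                boolean isIf = startsWithKeyword(fieldCodeText, i + 1, "IF");
                openFields.push(isIf);
                if (isIf) {
                    currentIfDepth++;
                    maxIfDepth = Math.max(maxIfDepth, currentIfDepth);
                }
            } else if (c == '}' && !openFields.isEmpty()) {
                // Closing an IF field reduces the current IF depth.
                if (openFields.pop()) {
                    currentIfDepth--;
                }
            }
        }
        return maxIfDepth;
    }

    /** True if the first word at or after 'from' equals the keyword, ignoring case. */
    private static boolean startsWithKeyword(String text, int from, String keyword) {
        int start = from;
        while (start < text.length() && Character.isWhitespace(text.charAt(start))) {
            start++;
        }
        int end = start;
        while (end < text.length() && Character.isLetter(text.charAt(end))) {
            end++;
        }
        return text.substring(start, end).equalsIgnoreCase(keyword);
    }

    public static void main(String[] args) {
        // Hypothetical field code: an IF field whose true-text contains another IF field.
        String code = "{ IF { MERGEFIELD A } = \"1\" "
                + "\"{ IF { MERGEFIELD B } = \"2\" \"x\" \"y\" }\" \"z\" }";
        System.out.println("Max IF nesting depth: " + maxIfDepth(code));
    }
}
```

A template could then be flagged when the reported depth exceeds a threshold you choose based on your own measurements.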

Please share more detail about how you want to calculate conditional statements, along with the following:

  • Your input Word document.
  • Your output Word document.
  • Your expected output.

We will investigate and provide more information about your query. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

high-complexity.doc.zip (32.2 KB)

The merge kept failing for this template. It consumes all of the server's memory, and as a result the server cannot accept any requests.

We just want to know how to measure the complexity of a template so that we can detect the issue before it happens.

@zwang

First you need to check how much time your code takes to fetch the data for mail merge region. Secondly, you need to remove unnecessary regions from the template. Your template document contains 11 regions. Please merge regions that have one-to-one relationship between tables.

Your document contains INCLUDEPICTURE fields that use web links. If the links are the same, please update these fields after the mail merge.

If you are generating a document for a huge number of records, it will take time. Please check on your end whether Document.Save or the mail merge with regions is taking the time. We suggest using the latest version, Aspose.Words for .NET 20.4. Hope this helps you.

Actually, it fails even for a small data size. The time spent fetching the data for the mail merge is negligible compared to the time spent merging the document.

The template comes from our tenant, and we cannot ask them to fix it unless we can measure the complexity automatically. That’s why I need to understand which kinds of templates lead to terrible performance.

Secondly, you need to remove unnecessary regions from the template. Your template document contains 11 regions.

I can’t tell which regions are unnecessary because the template belongs to our tenants.

Please merge regions that have one-to-one relationship between tables.

I am not sure I understand this. By tables, do you mean MS Word tables or data tables in the DB?

I would like to ask some questions so that we can use Aspose effectively:
Which has more impact on the rendering time, the number of regions or the number of nested regions?
Would many IF fields in regions degrade performance a lot?

FYI, a lot of merge fields are hidden; you need to show hidden text.

You said the template contains 11 regions, but why do I get 15 regions through DocumentVisitor? There are two groups of 4-level nested tables (TaxItem_1, TaxItem_3, TaxItem_4, TaxItem_2) in the table TaxItem, but I can see only one 4-level nested table in the Word document.

 MERGEFIELD TableStart:Subscription\* MERGEFORMAT 
 MERGEFIELD TableEnd:Subscription\* MERGEFORMAT 
 MERGEFIELD TableStart:InvoiceItem \* MERGEFORMAT 
 MERGEFIELD TableStart:InvoiceItem_ServiceGroup 
 MERGEFIELD TableStart:InvoiceItem_Charge 
 MERGEFIELD TableStart:InvoiceItem_q
MERGEFIELD TableEnd:InvoiceItem_q 
 MERGEFIELD TableEnd:InvoiceItem_Charge 
MERGEFIELD TableEnd:InvoiceItem_ServiceGroup 
 MERGEFIELD TableEnd:InvoiceItem 
 MERGEFIELD  TableStart:TaxItem  \* MERGEFORMAT 
 MERGEFIELD TableStart:TaxItem_1 
 MERGEFIELD TableStart:TaxItem_3 
 MERGEFIELD  TableStart:TaxItem_4 
 MERGEFIELD TableStart:TaxItem_2 
 MERGEFIELD TableEnd:TaxItem_2 
 MERGEFIELD TableEnd:TaxItem_4 
 MERGEFIELD TableEnd:TaxItem_3 
 MERGEFIELD TableEnd:TaxItem_1 
 MERGEFIELD TableStart:TaxItem_1 
 MERGEFIELD TableStart:TaxItem_3 
 MERGEFIELD  TableStart:TaxItem_4 
 MERGEFIELD TableStart:TaxItem_2 
 MERGEFIELD TableEnd:TaxItem_2 
 MERGEFIELD TableEnd:TaxItem_4 
 MERGEFIELD TableEnd:TaxItem_3 
 MERGEFIELD TableEnd:TaxItem_1 
 MERGEFIELD TableEnd:TaxItem 
 MERGEFIELD  TableStart:Transaction  \* MERGEFORMAT 
 MERGEFIELD  TableEnd:Transaction  \* MERGEFORMAT 

My code is:

import com.aspose.words.Field;
import com.aspose.words.FieldStart;
import com.aspose.words.FieldType;

public class DocumentVisitor extends com.aspose.words.DocumentVisitor {

    @Override
    public int visitFieldStart(FieldStart fieldStart) throws Exception {
        final Field field = fieldStart.getField();

        switch (field.getType()) {
            case FieldType.FIELD_MERGE_FIELD:
                final String fieldCode = field.getFieldCode(false);

                // Print every merge field that opens or closes a region.
                if (fieldCode.contains("TableStart") || fieldCode.contains("TableEnd")) {
                    System.out.println(fieldCode);
                }
                break;
            default:
                break;
        }

        return super.visitFieldStart(fieldStart);
    }
}

@zwang

If you add many regions and IF fields to a document, the document becomes complex.

Please note that performance heavily depends on the local environment. It can be completely different for a server that generates thousands of documents 24/7 than for a local PC that generates a single document on demand.

If you load huge Word documents into the Aspose.Words DOM, more memory is required, because the whole document must be held in memory during processing. Usually, Aspose.Words needs about 10 times more memory than the original document size to build the DOM in memory.
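
The 10 times rule of thumb can be turned into a simple pre-check before a template is loaded. The sketch below is only an illustration: the class name MergeMemoryGuard and the 5 MB template size are hypothetical, and the factor of 10 is the approximate figure quoted above rather than a guarantee. It estimates only the memory needed to load the template, not the growth caused by merging data into nested regions.

```java
final class MergeMemoryGuard {

    // Rule of thumb quoted in this thread: the DOM needs roughly 10x the file size.
    static final long DOM_SIZE_FACTOR = 10;

    /** Rough estimate of the heap needed just to load the template into a DOM. */
    static long estimateDomBytes(long templateFileBytes) {
        return templateFileBytes * DOM_SIZE_FACTOR;
    }

    /** True if the estimated DOM size fits into the given number of free heap bytes. */
    static boolean fitsInHeap(long templateFileBytes, long freeHeapBytes) {
        return estimateDomBytes(templateFileBytes) <= freeHeapBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Heap still obtainable by this JVM: max heap minus what is currently in use.
        long freeHeap = rt.maxMemory() - (rt.totalMemory() - rt.freeMemory());
        long templateBytes = 5L * 1024 * 1024; // hypothetical 5 MB template
        System.out.println("Estimated DOM bytes: " + estimateDomBytes(templateBytes));
        System.out.println("Fits in heap: " + fitsInHeap(templateBytes, freeHeap));
    }
}
```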

We suggest using the SaveOptions.MemoryOptimization property to reduce memory usage. Setting this option to true can significantly decrease memory consumption while saving large documents, at the cost of slower saving time. Hope this helps you.

If you still face the problem, please attach the following resources here for testing:

  • Your input Word document.
  • A standalone console application (source code without compilation errors) that helps us reproduce your problem on our end.

As soon as you have these pieces of information ready, we will start investigating your issue and provide you with more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

Thank you. I will collect the resources you need.
My last question is why I got 15 regions through the DocumentVisitor. It seems Aspose visited the tables (TaxItem_1, TaxItem_3, TaxItem_4, TaxItem_2) twice.

The code is attached above.

@zwang

We have tested the scenario and have managed to reproduce the same issue at our side. For the sake of correction, we have logged this problem in our issue tracking system as WORDSNET-20295 . You will be notified via this forum thread once this issue is resolved.

We apologize for your inconvenience.

tahir.manzoor:

We have tested the scenario and have managed to reproduce the same issue at our side. For the sake of correction, we have logged this problem in our issue tracking system as WORDSNET-20295 . You will be notified via this forum thread once this issue is resolved.

Please let me know the pattern causing the problem once you have figured out the root cause. We could ask our customers to avoid the pattern until it’s resolved.
Thank you

@zwang

Please note that the issue (WORDSNET-20295) is related to an incorrect count of TableStart fields. For the performance issue, please use the latest version of Aspose.Words for .NET 20.4. If you face incorrect output or a performance issue, please share the details for testing as requested in my earlier post.

I see. Thank you.

@zwang

Please note that the issue you are facing is actually not a bug in Aspose.Words, so we have closed this issue (WORDSNET-20295) as ‘Not a Bug’. Please use the following code example to get the correct regions.

Document doc = new Document(MyDir + "high-complexity.doc");
foreach (Field field in doc.Range.Fields)
{
    if (field.Type == FieldType.FieldMergeField)
    {
        if (field.GetFieldCode(false).Contains("TableStart"))
            Console.WriteLine(field.GetFieldCode(false));
    }
}
doc.Accept(new DocStructurePrinter());

public class DocStructurePrinter : DocumentVisitor
{
    public override VisitorAction VisitFieldStart(FieldStart start)
    {
        if (fieldResultEndChar == null)
        {
            if (start.FieldType == FieldType.FieldMergeField)
            {
                FieldMergeField field = (FieldMergeField)start.GetField();
                if (field.FieldName.StartsWith("TableStart") || field.FieldName.StartsWith("TableEnd"))
                    Console.WriteLine(field.FieldName);
            }
        }

        return VisitorAction.Continue;
    }

    public override VisitorAction VisitFieldSeparator(FieldSeparator fieldSeparator)
    {
        if (fieldResultEndChar == null)
            fieldResultEndChar = fieldSeparator.GetField().End;

        return VisitorAction.Continue;
    }

    public override VisitorAction VisitFieldEnd(FieldEnd fieldEnd)
    {
        if (fieldResultEndChar == fieldEnd)
            fieldResultEndChar = null;

        return VisitorAction.Continue;
    }

    private FieldEnd fieldResultEndChar;
}

Thank you for the investigation. The code is in .NET; could you confirm that the behavior is the same in Java?

@zwang

Please accept my apology for the inconvenience. You can use the following code example to get the desired output. Hope this helps you.

Document doc = new Document(MyDir + "high-complexity.doc");
doc.accept(new DocumentVisitor());

public class DocumentVisitor extends com.aspose.words.DocumentVisitor {

    // End node of the field whose result is currently being visited, or null if none.
    private FieldEnd fieldResultEndChar;

    @Override
    public int visitFieldStart(FieldStart fieldStart) throws Exception {
        // Skip field starts that occur inside another field's result.
        if (fieldResultEndChar == null) {
            if (fieldStart.getFieldType() == FieldType.FIELD_MERGE_FIELD) {
                FieldMergeField field = (FieldMergeField) fieldStart.getField();
                if (field.getFieldName().startsWith("TableStart") || field.getFieldName().startsWith("TableEnd"))
                    System.out.println(field.getFieldName());
            }
        }

        return super.visitFieldStart(fieldStart);
    }

    @Override
    public int visitFieldSeparator(FieldSeparator fieldSeparator) throws Exception {
        // A separator marks the start of a field result; remember where that field ends.
        if (fieldResultEndChar == null)
            fieldResultEndChar = fieldSeparator.getField().getEnd();

        return super.visitFieldSeparator(fieldSeparator);
    }

    @Override
    public int visitFieldEnd(FieldEnd fieldEnd) throws Exception {
        // Leaving the remembered field: its result has been fully visited.
        if (fieldResultEndChar == fieldEnd)
            fieldResultEndChar = null;

        return super.visitFieldEnd(fieldEnd);
    }
}
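
To turn the printed region names into a complexity measure, the names can be fed through a stack in document order. The sketch below is an illustration using only the Java standard library. The class name RegionNesting and the int[] return shape are hypothetical, and it assumes the names arrive in the form "TableStart:Name" and "TableEnd:Name", as printed by the visitor above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class RegionNesting {

    /** Returns {number of regions, deepest nesting level} for names in document order. */
    static int[] measure(List<String> fieldNames) {
        Deque<String> open = new ArrayDeque<>(); // regions that are currently open
        int regions = 0;
        int maxDepth = 0;

        for (String raw : fieldNames) {
            String name = raw.trim();
            if (name.startsWith("TableStart:")) {
                open.push(name.substring("TableStart:".length()));
                regions++;
                maxDepth = Math.max(maxDepth, open.size());
            } else if (name.startsWith("TableEnd:")) {
                String region = name.substring("TableEnd:".length());
                // Each TableEnd must close the most recently opened region.
                if (open.isEmpty() || !open.pop().equals(region)) {
                    throw new IllegalArgumentException("Unbalanced region: " + region);
                }
            }
        }
        if (!open.isEmpty()) {
            throw new IllegalArgumentException("Unclosed region: " + open.peek());
        }
        return new int[] {regions, maxDepth};
    }

    public static void main(String[] args) {
        // One TaxItem branch from the template discussed in this thread.
        List<String> names = Arrays.asList(
                "TableStart:TaxItem", "TableStart:TaxItem_1", "TableStart:TaxItem_3",
                "TableStart:TaxItem_4", "TableStart:TaxItem_2", "TableEnd:TaxItem_2",
                "TableEnd:TaxItem_4", "TableEnd:TaxItem_3", "TableEnd:TaxItem_1",
                "TableEnd:TaxItem");
        int[] result = measure(names);
        System.out.println("Regions: " + result[0] + ", max depth: " + result[1]);
    }
}
```

In practice the visitor would collect the names into a list instead of printing them, and the template could be rejected when the depth or the region count exceeds a limit you have chosen.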

You can create a template document with merge fields by using any Word editor application, such as Microsoft Word. With a Word editor, you can take advantage of the visual interface to design a unique layout.

@Kimbers

We always encourage positive feedback from our customers. Please let us know if you have any query related to Aspose.Words.