Aspose 11.10 and mailmerge regions

Dear Aspose Team,

I tried to use version 11.10 of Aspose.Words for Java instead of the fairly old 10.7 that we have in production use.

Unfortunately I found some issues. For one of them I found a possible solution in the thread MailMerge Remove Empty Fields.

The other issue is unresolved:

Our templates with multiple top-level regions can't be processed properly anymore. We have multiple possible data sources, and I call mailmerge.executeWithRegions() for each of them. In 10.7 that works perfectly. In 11.10 the first call of executeWithRegions() clears/removes all fields for which getValue() on IMailMergeDataSource returns null. I tested further and found that it would work if all top-level regions were nested under one new top-level region.
We can't force our users to change all of their templates, so we depend on the old behaviour of version 10.7.
I already reported this change of behaviour on the 3rd of July 2012:
Version 11.1.0 & MailMerge Regions

I would be glad if there is a solution soon.

Thanks a lot!
Kind regards,

Hi Jens,

Please accept my apologies for the late response.

Thanks for your inquiry. I have tested the document shared at the following forum link and have not found any issue.
https://forum.aspose.com/t/63907

ns1045:
I tested further and found that it would work if all top-level regions were nested under one new top-level region.

Could you please share some more detail about your query along with code/documents?

Hi Tahir,

Sorry for the late response, but I didn't get a notification about your last post…

I tested further and found the change of behaviour:

Until now we have been using setRemoveEmptyRegions() and setRemoveEmptyParagraphs() to get rid of unrecognized fields and regions. I think that in Aspose versions before 11.0 these flags were applied when the document was saved.
In 11.10 these two methods and the new setCleanupOptions() are applied during the mail merge execution itself.

Until now, our mechanism for executing multiple regions has been to have multiple IMailMergeDataSources and to call executeWithRegions() for each one. With the new behaviour, the first call removes all regions that don't match the first data source. By the time executeWithRegions() is called with the next data source, all remaining regions have already been removed.

So our question is: is this change of behaviour intended?

A workaround for us would be to set the cleanup options just before calling executeWithRegions() with the last data source. But I still wonder why this change of behaviour was made.

I have another question concerning CleanupOptions: how do I combine multiple cleanup options like removeEmptyParagraphs, removeUnusedRegions, removeUnusedFields… ?
The API is a little odd, because a setter suggests that only one single value can be set.

Thanks a lot

Hi Jens,

Thanks for your inquiry.

ns1045:
Until now we have been using setRemoveEmptyRegions() and setRemoveEmptyParagraphs() to get rid of unrecognized fields and regions. I think that in Aspose versions before 11.0 these flags were applied when the document was saved.
In 11.10 these two methods and the new setCleanupOptions() are applied during the mail merge execution itself.

Yes, the setRemoveEmptyParagraphs and setRemoveEmptyRegions methods are obsolete. The latest version of Aspose.Words for Java uses the setCleanupOptions method to set a combination of flags that specify which items should be removed during mail merge. Please see the following documentation link for reference.
https://docs.aspose.com/words/java/aspose-words-for-java/

ns1045:
Until now, our mechanism for executing multiple regions has been to have multiple IMailMergeDataSources and to call executeWithRegions() for each one. With the new behaviour, the first call removes all regions that don't match the first data source. By the time executeWithRegions() is called with the next data source, all remaining regions have already been removed. So our question is: is this change of behaviour intended?

It depends on how you use the cleanup flags. In your case, please do not set MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS for every executeWithRegions call. Set this cleanup option only for the last call of executeWithRegions, as you mentioned in your post.

https://reference.aspose.com/words/java/com.aspose.words/IMailMergeDataSource

// Create some data that we will use in the mail merge.
CustomerList customers = new CustomerList();
customers.add(new Customer("Thomas Hardy", "120 Hanover Sq., London"));
customers.add(new Customer("Paolo Accorti", "Via Monte Bianco 34, Torino"));
// Open the template document.
Document doc = new Document(MyDir + "MailingLabelsDemo.doc");
// To be able to mail merge from your own data source, it must be wrapped
// into an object that implements the IMailMergeDataSource interface.
CustomerMailMergeDataSource customersDataSource = new CustomerMailMergeDataSource(customers);
// First call: REMOVE_UNUSED_REGIONS is left out so that the other regions stay in the document.
doc.getMailMerge().setCleanupOptions(MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS | MailMergeCleanupOptions.REMOVE_UNUSED_FIELDS | MailMergeCleanupOptions.REMOVE_CONTAINING_FIELDS);
// Now you can pass your data source into Aspose.Words.
doc.getMailMerge().executeWithRegions(customersDataSource);
// This example creates a table, but you would normally load the table from a database.
java.sql.ResultSet resultSet = createCachedRowSet(new String[] { "SerialNumber", "Item", "Quantity", "PricePerItem", "Amount" });
addRow(resultSet, new String[] { "1", "Milk", "1", "10", "10"});
com.aspose.words.DataTable results = new com.aspose.words.DataTable(resultSet, "Result");
// Second call: same options, still without REMOVE_UNUSED_REGIONS.
doc.getMailMerge().setCleanupOptions(MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS | MailMergeCleanupOptions.REMOVE_UNUSED_FIELDS | MailMergeCleanupOptions.REMOVE_CONTAINING_FIELDS);
doc.getMailMerge().executeWithRegions(results);
com.aspose.words.DataTable results2 = new com.aspose.words.DataTable(resultSet, "table2");
// Last call: REMOVE_UNUSED_REGIONS is added only here.
doc.getMailMerge().setCleanupOptions(MailMergeCleanupOptions.REMOVE_UNUSED_FIELDS | MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS | MailMergeCleanupOptions.REMOVE_CONTAINING_FIELDS | MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS);
doc.getMailMerge().executeWithRegions(results2);
doc.save(MyDir + "MailMerge.CustomDataSource Out.doc");
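To make the effect of the call order easier to see, here is a toy model of the behaviour described in this thread. It is only a sketch: it does not use Aspose.Words, the class CleanupTimingSketch and the region names are hypothetical, and it assumes that a call with "remove unused regions" enabled drops every region that has not been merged yet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the cleanup timing discussed above; not the Aspose.Words implementation. */
final class CleanupTimingSketch {
    /** Regions still present (not yet merged) in the pretend template. */
    private final Set<String> pending = new LinkedHashSet<>();
    /** Regions that received data, in merge order. */
    private final List<String> merged = new ArrayList<>();

    CleanupTimingSketch(List<String> regionNames) {
        pending.addAll(regionNames);
    }

    /** Merges one region; if removeUnused is true, all other pending regions are dropped. */
    void executeWithRegions(String regionName, boolean removeUnused) {
        if (pending.remove(regionName)) {
            merged.add(regionName);
        }
        if (removeUnused) {
            pending.clear();
        }
    }

    List<String> mergedRegions() {
        return merged;
    }

    /** One call per data source, with cleanup on every call or only on the last one. */
    static List<String> run(List<String> template, List<String> sources, boolean cleanupOnlyOnLast) {
        CleanupTimingSketch doc = new CleanupTimingSketch(template);
        for (int i = 0; i < sources.size(); i++) {
            boolean isLast = i == sources.size() - 1;
            boolean removeUnused = cleanupOnlyOnLast ? isLast : true;
            doc.executeWithRegions(sources.get(i), removeUnused);
        }
        return doc.mergedRegions();
    }

    public static void main(String[] args) {
        List<String> regions = Arrays.asList("RegionA", "RegionB", "RegionC");
        System.out.println(run(regions, regions, false)); // [RegionA]
        System.out.println(run(regions, regions, true));  // [RegionA, RegionB, RegionC]
    }
}
```

In this model, enabling the cleanup on every call leaves only the first region merged, while enabling it on the last call lets all three data sources find their regions.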

ns1045:
I have another question concerning CleanupOptions: how to combine multiple cleanup options like removeEmptyParagraphs, removeUnusedRegions, removeUnusedFields… ?

You can combine multiple cleanup options as shown below:

doc.getMailMerge().setCleanupOptions(MailMergeCleanupOptions.REMOVE_UNUSED_FIELDS | MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS | MailMergeCleanupOptions.REMOVE_CONTAINING_FIELDS | MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS);
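The single setter accepts several options because the value is a set of bit flags packed into one int. The following minimal sketch shows the idea; the class CleanupFlagsDemo and its constant values are illustrative stand-ins chosen for this example and are not taken from the Aspose.Words API.

```java
/** Illustrative int-flag constants; the values are assumptions, not Aspose.Words constants. */
final class CleanupFlagsDemo {
    static final int REMOVE_EMPTY_PARAGRAPHS = 1;  // binary 0001
    static final int REMOVE_UNUSED_REGIONS = 2;    // binary 0010
    static final int REMOVE_UNUSED_FIELDS = 4;     // binary 0100
    static final int REMOVE_CONTAINING_FIELDS = 8; // binary 1000

    /** Returns true when every bit of flag is set in options. */
    static boolean isSet(int options, int flag) {
        return (options & flag) == flag;
    }

    /** Returns options with the bits of flag cleared. */
    static int without(int options, int flag) {
        return options & ~flag;
    }

    public static void main(String[] args) {
        // Bitwise OR packs several flags into a single int value.
        int options = REMOVE_EMPTY_PARAGRAPHS | REMOVE_UNUSED_FIELDS;
        System.out.println(isSet(options, REMOVE_UNUSED_FIELDS));  // true
        System.out.println(isSet(options, REMOVE_UNUSED_REGIONS)); // false
    }
}
```

A helper like without() is one way to keep a full option set around and strip the region cleanup for every call except the last.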

Hi Tahir,

thanks a lot for your really good & interesting response. That will help us a lot!

Cheers

Hi Jens,

Thanks for your feedback. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.

Hi Tahir,

I found a problem that still exists: if a region is used multiple times in one template, only the first occurrence is executed by executeWithRegions(). So I call executeWithRegions() repeatedly until all regions with that ID have been executed.
But with the changed behaviour of setCleanupOptions(MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS) I have a problem: I don't know how many times the region is used in the template. So I don't know which call of executeWithRegions() is the last one, for which I could then call setCleanupOptions() with REMOVE_UNUSED_REGIONS.

Do you have any suggestion for how to solve the problem? Do you know whether that feature (multiple execution of the same region) will be implemented, and in which version? I already posted that ‘problem’ about two years ago. I think Alex told me that the feature would be added to your to-do list.

Thank you very much!
Cheers

Hi Jens,

Thanks for your inquiry. In your scenario, please call setCleanupOptions just before saving the output document.

It would be great if you could share some details about your mail merge structure. Please attach your MS Word template here for our reference. I will check your scenario with your template document and provide more information.
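One possible way to find out which call is the last one is to count the TableStart fields per region name before merging. The sketch below assumes the merge field codes have already been read from the template as plain strings in document order; the class RegionCounter and the sample field codes are hypothetical, and no Aspose.Words call is used.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Counts how often each mail merge region starts in a list of field codes. */
final class RegionCounter {
    // Captures the region name in a field code such as " MERGEFIELD TableStart:REGION1 ".
    private static final Pattern TABLE_START = Pattern.compile("TableStart:([^\\s\"]+)");

    /** Returns the number of TableStart fields per region name, in order of first appearance. */
    static Map<String, Integer> countRegions(List<String> fieldCodes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String code : fieldCodes) {
            Matcher matcher = TABLE_START.matcher(code);
            if (matcher.find()) {
                counts.merge(matcher.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> codes = Arrays.asList(
                " MERGEFIELD TableStart:REGION1 ", " MERGEFIELD FIELD1 ", " MERGEFIELD TableEnd:REGION1 ",
                " MERGEFIELD TableStart:REGION2 ", " MERGEFIELD TableEnd:REGION2 ",
                " MERGEFIELD TableStart:REGION1 ", " MERGEFIELD TableEnd:REGION1 ");
        System.out.println(countRegions(codes)); // {REGION1=2, REGION2=1}
    }
}
```

Note that this counts nested occurrences as well as top-level ones, so the total is only an upper bound on the number of top-level calls needed for a region name.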

Hi Tahir,

I found that the problem is bigger than I expected. We are using regions with the same TableName as subregions and as main regions in one document. So I can't find a correct point in time to set the cleanup option removeEmptyRegions.
I wrote a few JUnit tests with corresponding template documents and expected output values. You’ll find these tests in the attached zip file. I tried to create one test for each use case we have.

Thanks a lot!
Cheers

Hi Jens,

Thanks for sharing the documents and code. I have worked with your code and documents, and I suggest calling MailMerge.setCleanupOptions just before the last call of the MailMerge.executeWithRegions method, as shown in the following code snippet.

MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS specifies whether paragraphs that contained mail merge fields with no data should be removed from the document. When this option is set, paragraphs that contain region start and end merge fields and are otherwise empty are also removed.

Document document = new Document(MyDir + "inputMultipleHierarchicRegions.doc");

Map<String, IMailMergeDataSource> subRegions = new HashMap<String, IMailMergeDataSource>();
final IMailMergeDataSource subRegion = getRegionDataSource("REGION1", "SubRegion", null);
// Make sure the inner region returns false on moveNext().
subRegion.moveNext();
subRegions.put("REGION1", subRegion);
document.getMailMerge().executeWithRegions(getRegionDataSource("REGION2", null, subRegions));
document.getMailMerge().executeWithRegions(getRegionDataSource("REGION2", null, null));
document.getMailMerge().executeWithRegions(getRegionDataSource("REGION1", "MainRegion", null));
// Set the cleanup options only before the last executeWithRegions call.
document.getMailMerge().setCleanupOptions(MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS | MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS);
document.getMailMerge().executeWithRegions(getRegionDataSource("REGION1", "MainRegion", null));

Hi Tahir,

That's exactly what I do in testMultipleHierarchicRegions(), and it is precisely what does NOT work.

We want subregions (executed by child data sources) to be removed if the child data source returns false on moveNext(). Otherwise the subregions remain and are executed by the next main data source with the same ID. I wanted to express this with my test cases; I’m sorry if I did not express it very well. Did you run these tests? If you run them, you will see what is expected and what the mail merge produces.

Thanks a lot! Cheers

Hi Jens,

Thanks for your inquiry. In case you are using an older version of Aspose.Words, I suggest upgrading to the latest version (v13.1.0) from here.

I tested your code. The behavior of Aspose.Words is correct in the output documents. Please see my comments in the following code snippet. I am saving the document after each executeWithRegions call. Please check the output documents to see the behavior of Aspose.Words.

public void testMultipleHierarchicRegions() throws Exception
{
    // AsposeLicense.checkLicense();
    Document document = new Document(MyDir + "inputMultipleHierarchicRegions.doc");
    Map<String, IMailMergeDataSource> subRegions = new HashMap<String, IMailMergeDataSource>();
    final IMailMergeDataSource subRegion = getRegionDataSource("REGION1", "SubRegion", null);
    // Make sure the inner region returns false on moveNext().
    // subRegion.moveNext();
    subRegions.put("REGION1", subRegion);
    // Execute the first occurrence of Region 1.
    document.getMailMerge().executeWithRegions(getRegionDataSource("REGION1", "MainRegion", null));
    document.save(MyDir + "out1.doc");
    // Execute the second occurrence of Region 1.
    document.getMailMerge().executeWithRegions(getRegionDataSource("REGION1", "MainRegion", null));
    document.save(MyDir + "out2.doc");
    // Execute Region 2 with sub Region 1. The result depends on subRegion.moveNext().
    document.getMailMerge().executeWithRegions(getRegionDataSource("REGION2", null, subRegions));
    document.save(MyDir + "out3.doc");
    // Set the cleanup options only before the last call.
    document.getMailMerge().setCleanupOptions(MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS |
        MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS);
    // Execute the last Region 2.
    document.getMailMerge().executeWithRegions(getRegionDataSource("REGION2", null, null));
    document.save(MyDir + "FinalOut.doc");
}

If you still face problems, please manually create your expected output Word document using Microsoft Word and attach it here for our reference. We will investigate how you want your final Word output to be generated. We will then provide you more information on this along with code.

Hi Tahir,

thanks for your reply. I use 13.1.0! But I’ll have a look at your code snippet.

Cheers

Hi Tahir,

Attached you’ll find the expected result document. I would like to know when I have to set the cleanup options to get this particular result. In 10.7 the output was the same as expected.doc.

As a second attachment you’ll find the source code of the same test for Aspose version 10.7, which reproduces the expected output.

So you can see the difference in behaviour between 10.7 and 13.1.0. In 13.1.0 I have no way to produce the same output as in 10.7.

Cheers,

Hi Jens,

Thanks for sharing the code/document. Please note that every new release of Aspose.Words comes up with some new features, enhancements in the existing features and bug fixes. We always encourage our customers to use the latest version of Aspose.Words as it contains newly introduced features, enhancements and fixes to the issues that are reported earlier.

Regarding the empty paragraph issue, I would like to point out that the MailMergeCleanupOptions.RemoveContainingFields behavior is correct in the output file generated with Aspose.Words for Java 13.1.0. Please read the details of RemoveEmptyParagraphs and RemoveUnusedRegions as described below:

RemoveEmptyParagraphs: Specifies whether paragraphs that contained mail merge fields with no data should be removed from the document. When this option is set, paragraphs which contain region start and end merge fields which are otherwise empty are also removed.
RemoveUnusedRegions: Specifies whether unused mail merge regions should be removed from the document.

Moreover, you can remove empty Paragraph nodes from your document by using the following code snippet. Hope this answers your query. Let us know if you have any more queries.

Document doc = new Document(MyDir + "in.doc");

// Visit every node in the document and drop paragraphs that have no child nodes.
RemoveEmptyParagraph parRem = new RemoveEmptyParagraph();
doc.accept(parRem);
doc.save(MyDir + "out.doc");

private class RemoveEmptyParagraph extends DocumentVisitor
{
    // Called when the visitor enters a paragraph.
    public int visitParagraphStart(Paragraph paragraph) throws Exception
    {
        // A paragraph without child nodes has no runs, fields, or other content.
        if (!paragraph.hasChildNodes())
            paragraph.remove();
        return VisitorAction.CONTINUE;
    }
}

Hi Tahir,

Thank you for your reply and tips. I will try to explain our problem again.

Current circumstances:
- Aspose version 13.1.0
- template with multiple regions and subregions which can have the same ID

So we need to control whether a -->subregion<-- is removed if it is empty. In our case, an empty subregion should always be removed.
In some of our cases subregions and main regions have the same ID, so if the subregions remain when their data source returns false, they will be executed by the next call of executeWithRegions() with the corresponding main data source.
Example: The subregion has the ID “region1”. It is under “region2”. Region2 is executed and returns true, but the child data source (with ID “region1”) returns false. So the subregion “region1” remains. On the next call of executeWithRegions() with a (main) data source with ID “region1”, the subregion will be filled with the data of main region 1.

If that behaviour is correct, as you say, then we need a second flag to remove subregions if they are empty.
But I think subregions should always be removed if they are empty. Yes, it is a special case when subregions and main regions have the same ID, but this kind of template worked with 10.7 and our customizers depend on it.

Please run the given tests again and look at the output of version 10.7 and at the output of 13.1.0.
In 10.7 the subregion “region1” under “region2” is correctly removed. In 13.1.0 the subregion “region1” under “region2” remains (because the CleanupOptions are not set) and is executed with the next call of executeWithRegions(datasource_of_region1). So incorrect data is inserted…

FYI: I tried all possible combinations of REMOVE_* flags with no success.

Thank you again. Cheers

Hi Jens,

Thanks for sharing the details. I tested the scenario with the old version (10.7) of Aspose.Words for Java. In that version the setRemoveEmptyRegions method does not work according to the correct behavior. MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS removes all unused mail merge regions from the document.

ns1045:
Example: The subregion has the ID “region1”. It is under “region2”. Region2 is executed and returns true, but the child data source (with ID “region1”) returns false. So the subregion “region1” remains. On the next call of executeWithRegions() with a (main) data source with ID “region1”, the subregion will be filled with the data of main region 1.

As per my understanding, you want to delete REGION1 from the following region example if the child data source is null/empty. In this case, you need to write your own cleanup method to remove such a region, and call MailMerge.setCleanupOptions just before the last call of the MailMerge.executeWithRegions method. Hope this answers your query. Please let us know if you have any more queries.

«TableStart:REGION2»
Region 2 Line blubba
«TableStart:REGION1»
Region 1 Inner: Line «FIELD1»
«TableEnd:REGION1»
«TableEnd:REGION2»
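A custom cleanup method first needs a way to tell a nested REGION1 from a top-level one. The following is a minimal sketch of that step only, assuming the merge field codes have already been collected from the template as plain strings in document order; the class NestedRegionFinder is hypothetical and no Aspose.Words API is used.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Finds regions that start inside another region, using a stack of open regions. */
final class NestedRegionFinder {
    // Group 1 is "Start" or "End", group 2 is the region name.
    private static final Pattern REGION_MARK = Pattern.compile("Table(Start|End):([^\\s\"]+)");

    /** Returns "PARENT/CHILD" for every region whose TableStart lies inside another region. */
    static List<String> findNested(List<String> fieldCodes) {
        Deque<String> open = new ArrayDeque<>();
        List<String> nested = new ArrayList<>();
        for (String code : fieldCodes) {
            Matcher matcher = REGION_MARK.matcher(code);
            if (!matcher.find()) {
                continue; // ordinary merge field, not a region marker
            }
            String name = matcher.group(2);
            if ("Start".equals(matcher.group(1))) {
                if (!open.isEmpty()) {
                    nested.add(open.peek() + "/" + name);
                }
                open.push(name);
            } else if (!open.isEmpty() && open.peek().equals(name)) {
                open.pop(); // close the innermost region; mismatched ends are ignored
            }
        }
        return nested;
    }

    public static void main(String[] args) {
        List<String> codes = Arrays.asList(
                "TableStart:REGION2", "TableStart:REGION1", "FIELD1", "TableEnd:REGION1", "TableEnd:REGION2",
                "TableStart:REGION1", "FIELD1", "TableEnd:REGION1");
        System.out.println(findNested(codes)); // [REGION2/REGION1]
    }
}
```

In the sample input only the REGION1 inside REGION2 is reported; the later top-level REGION1 is not, which is the distinction a cleanup method for same-named subregions would rely on.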

Hi Tahir,

thank you very much for your support!

Would it be possible for you to provide, as a new feature, an option to control the hiding of empty subregions?
As I said, we have templates in production use by our customizers that depend on the behaviour of Aspose version 10.7. So we would be very happy if Aspose could add such a feature in the near future.

Thanks a lot again!

Kind regards

Hi Jens,

Thanks for your inquiry. You can remove specific fields from your template document by using the following method.

private static void removeField(FieldStart fieldStart) throws Exception
{
    Node currentNode = fieldStart;
    boolean isRemoving = true;
    while (currentNode != null && isRemoving)
    {
        if (currentNode.getNodeType() == NodeType.FIELD_END)
            isRemoving = false;
        Node nextNode = currentNode.nextPreOrder(currentNode.getDocument());
        currentNode.remove();
        currentNode = nextNode;
    }
}

Moreover, please use the following code snippet to remove a specific region from the document. You may use the same approach to remove specific subregions. Hope this helps you. Please let us know if you have any more queries.

Document doc = new Document(MyDir + "in.doc");
ArrayList<FieldStart> nodes = new ArrayList<FieldStart>();
String removeRegion = "REGION1";
// True while we are between TableStart and TableEnd of the region to remove.
boolean insideRegion = false;
for (FieldStart fStart: (Iterable <FieldStart> ) doc.getChildNodes(NodeType.FIELD_START, true))
{
    String fieldcode = getFieldCode(fStart);
    if (fieldcode.contains("TableStart:" + removeRegion))
    {
        insideRegion = true;
    }
    // Collect every field from TableStart up to and including TableEnd.
    if (insideRegion)
        nodes.add(fStart);
    if (fieldcode.contains("TableEnd:" + removeRegion))
    {
        insideRegion = false;
        break;
    }
}
// Remove the collected fields only after the iteration has finished.
for (FieldStart fStart: nodes)
{
    removeField(fStart);
}
doc.save(MyDir + "out.docx");

private static String getFieldCode(FieldStart fieldStart) throws Exception
{
    StringBuilder builder = new StringBuilder();
    for (Node node = fieldStart; node != null && node.getNodeType() != NodeType.FIELD_SEPARATOR &&
        node.getNodeType() != NodeType.FIELD_END; node = node.nextPreOrder(node.getDocument()))
    {
        // Use text only of Run nodes to avoid duplication.
        if (node.getNodeType() == NodeType.RUN)
            builder.append(node.getText());
    }
    return builder.toString();
}