XML mail merge creates HUGE file

I am having a problem with XML mail merge. Attached is a ZIP archive with the template file, the XML data, and the resulting document.

The problem is that the resulting file is really HUGE, far larger than the data fed into it would justify.

If it helps, the merge is performed like this:

private static void merge(Document document, InputStream xmlStream, boolean withRegions) throws Exception {
	// set the callback for image and HTML merging, and enable Mustache syntax
	com.aspose.words.MailMerge mm = document.getMailMerge();
	mm.setFieldMergingCallback(new ImageMerge());
	mm.setTrimWhitespaces(true);

	// create the dataset
	com.aspose.words.net.System.Data.DataSet dataSet = new com.aspose.words.net.System.Data.DataSet();
	dataSet.readXml(xmlStream);

	// optionally merge the regions
	if (withRegions) {
		mm.setMergeDuplicateRegions(true);

		// set cleanup of unused fields
		mm.setCleanupOptions(
			// remove unfilled regions
			MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS
			// remove empty table rows
			| MailMergeCleanupOptions.REMOVE_EMPTY_TABLE_ROWS
			// remove empty paragraphs
			| MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS
		);

		// run the merge
		mm.executeWithRegions(dataSet);
	}

	// set cleanup of unused fields
	mm.setCleanupOptions(
		// remove unfilled fields
		MailMergeCleanupOptions.REMOVE_UNUSED_FIELDS
		// remove nested (containing) fields
		| MailMergeCleanupOptions.REMOVE_CONTAINING_FIELDS
		// remove empty paragraphs
		| MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS
	);

	// run the merge
	mm.execute(dataSet.getTables().get(0));
}

I think the error might be related to mm.setMergeDuplicateRegions(true). In fact, if I wrap the region modulo in another region and change the XML accordingly, it works properly. However, judging by the examples in the docs, the current setup should work correctly too.

xml_mail_merge_error.zip (2.0 MB)

@mtassinari,

Thanks for your inquiry. Please use the MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS and MailMergeCleanupOptions.REMOVE_STATIC_FIELDS cleanup options to reduce the output file size.

mm.setCleanupOptions(
        // remove unfilled fields
        MailMergeCleanupOptions.REMOVE_UNUSED_FIELDS
        // remove nested (containing) fields
        | MailMergeCleanupOptions.REMOVE_CONTAINING_FIELDS
        // remove empty paragraphs
        | MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS
        | MailMergeCleanupOptions.REMOVE_UNUSED_REGIONS
        | MailMergeCleanupOptions.REMOVE_STATIC_FIELDS
);

mm.execute(dataSet.getTables().get(0));

I am sorry, but I fail to see how that could help:

  • the region is used, so there is nothing to be removed
  • the static fields are necessary

@mtassinari,

Thanks for your inquiry. We have tested the scenario and noticed that MailMerge.CleanupOptions does not remove the IF field. We have logged this problem in our issue tracking system as WORDSNET-15955. You will be notified via this forum thread once the issue is resolved. We apologize for the inconvenience.

Please replace the IF field with its result using the following code. You can get the code of the FieldsHelper class from the Aspose.Words for Java examples repository on GitHub.

mm.setCleanupOptions(
        // remove unfilled fields
        MailMergeCleanupOptions.REMOVE_UNUSED_FIELDS
        // remove nested (containing) fields
        | MailMergeCleanupOptions.REMOVE_CONTAINING_FIELDS
        // remove empty paragraphs
        | MailMergeCleanupOptions.REMOVE_EMPTY_PARAGRAPHS
);

// run the merge
mm.execute(dataSet.getTables().get(0));

FieldsHelper.convertFieldsToStaticText(doc, FieldType.FIELD_IF);
doc.save(MyDir + "output.docx");

@mtassinari,

Thanks for your patience. The behavior you are seeing is not actually a bug in Aspose.Words, so we have closed this issue (WORDSNET-15955) as ‘Not a Bug’.

Your template document contains a table whose row is a mail merge region. The table cells contain IF fields with the following pattern:

{ IF { MERGEFIELD condition } = 1 "{ MERGEFIELD result }" "{ MERGEFIELD result }" }

After the first merge (without the field-removal options), the IF fields look like this:

{ IF 1 = 1 "result" "{ MERGEFIELD result }" }

During the second merge (with the field-removal options), the nested MERGEFIELD is not merged because of the parent IF field's condition, so it is retained in the document. MS Word behaves the same way.
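To make the retained field easier to see, here is a small stand-alone toy model in plain Java. It does not use Aspose.Words and is not how the library is implemented: the IfFieldModel and IfField names are hypothetical, and the rule "merge the condition first, then only the branch the condition selects" is an assumption made to reproduce the field text quoted in this thread.

```java
import java.util.HashMap;
import java.util.Map;

public class IfFieldModel {
    // Toy stand-in for: { IF left = right "trueText" "falseText" }
    static final class IfField {
        String left;
        final String right;
        String trueText;
        String falseText;

        IfField(String left, String right, String trueText, String falseText) {
            this.left = left;
            this.right = right;
            this.trueText = trueText;
            this.falseText = falseText;
        }

        // Replace "{ MERGEFIELD name }" placeholders with values from the record.
        static String fill(String text, Map<String, String> record) {
            String out = text;
            for (Map.Entry<String, String> e : record.entrySet()) {
                out = out.replace("{ MERGEFIELD " + e.getKey() + " }", e.getValue());
            }
            return out;
        }

        // ASSUMPTION: the condition is merged first, then only the selected branch.
        void merge(Map<String, String> record) {
            left = fill(left, record);
            if (left.equals(right)) {
                trueText = fill(trueText, record);
            } else {
                falseText = fill(falseText, record);
            }
        }

        @Override
        public String toString() {
            return "{ IF " + left + " = " + right
                    + " \"" + trueText + "\" \"" + falseText + "\" }";
        }
    }

    public static void main(String[] args) {
        Map<String, String> record = new HashMap<>();
        record.put("condition", "1");
        record.put("result", "result");

        IfField field = new IfField("{ MERGEFIELD condition }", "1",
                "{ MERGEFIELD result }", "{ MERGEFIELD result }");

        field.merge(record); // first merge
        System.out.println(field);

        field.merge(record); // second merge: the false branch is still untouched
        System.out.println(field);
    }
}
```

In this model both println calls print the same text, { IF 1 = 1 "result" "{ MERGEFIELD result }" }, because the MERGEFIELD in the unselected branch is never reached by either merge.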

The output file is large because the output document contains N identical tables of N rows each, where N = 112 (the size of the data source). The tables are populated during the second mail merge. You should either avoid the double merge with N records or use the workaround shared in my previous post.
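The growth can be checked with simple arithmetic. The sketch below is a stand-alone model, not Aspose.Words code: the MergeGrowth class and its methods are hypothetical, and it assumes that the region merge yields one table row per record and that the second merge repeats the whole table once per record.

```java
public class MergeGrowth {
    // Rows in one table after the region merge: one row per record (assumed).
    static long rowsAfterRegionMerge(int records) {
        return records;
    }

    // Total rows after a second merge that repeats the table once per record (assumed).
    static long rowsAfterSecondMerge(int records) {
        return (long) records * rowsAfterRegionMerge(records);
    }

    public static void main(String[] args) {
        int n = 112; // size of the data source, as stated in this thread
        System.out.println("tables: " + n);
        System.out.println("rows per table: " + rowsAfterRegionMerge(n));
        System.out.println("total rows: " + rowsAfterSecondMerge(n));
    }
}
```

With N = 112 this gives 112 × 112 = 12544 table rows in total, which is consistent with the output being far larger than the source data suggests.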