Mail Merge takes a very long time

I have a web project that needs to mail-merge up to 100 rows of data into a single document. There are 31 potential merge fields, but normally only 2 or 3 of them are used. The final merged document can be up to 500 pages long.

I have a few merge fields that may or may not be HTML and 2 fields that can be image fields.


The merge process seems to take ~5 minutes to complete, and I was wondering whether there is anything I can do to speed it up.

  1. Would cutting down the number of available merge fields have a noticeable impact on the merge time?
  2. Would removing the image fields have a noticeable impact on the time?
  3. Are there any other suggestions that might speed the process up?

My code is reasonably simple:
public static byte[] MergeFields(byte[] sourceDoc, DataTable dt)
{
    byte[] rVal;
    using (var wordByteStream = new MemoryStream(sourceDoc))
    {
        Document wordDoc = new Document(wordByteStream);
        wordDoc.MailMerge.CleanupOptions = Aspose.Words.Reporting.MailMergeCleanupOptions.RemoveUnusedFields;
        wordDoc.MailMerge.FieldMergingCallback = new HandleMerge();
        wordDoc.MailMerge.Execute(dt);
        rVal = wordDoc.SaveToByteArray();
    }
    return rVal;
}

public class HandleMerge : IFieldMergingCallback
{
    void IFieldMergingCallback.FieldMerging(FieldMergingArgs e)
    {
        // A field is treated as HTML whenever its value contains HTML markup.
        if (e.FieldValue != null && !string.IsNullOrEmpty(e.FieldValue.ToString()) && e.FieldValue.ToString().ContainsHtml())
        {
            // Insert the text for this merge field as HTML data, using DocumentBuilder.
            DocumentBuilder builder = new DocumentBuilder(e.Document);
            builder.MoveToMergeField(e.DocumentFieldName);
            builder.InsertHtml((string)e.FieldValue);

            // The HTML text itself should not be inserted.
            // We have already inserted it as HTML.
            e.Text = "";
        }
    }

    /// <summary>
    /// This is called when the mail merge engine encounters an Image:XXX merge field in the document.
    /// You have a chance to return an Image object, file name or a stream that contains the image.
    /// </summary>
    void IFieldMergingCallback.ImageFieldMerging(ImageFieldMergingArgs e)
    {
        // The field value is a byte array, just cast it and create a stream on it.
        var bytes = e.FieldValue as byte[];
        if (bytes != null) // It is an image field and the value has bytes.
        {
            MemoryStream imageStream = new MemoryStream(bytes);
            // Now the mail merge engine will retrieve the image from the stream.
            e.ImageStream = imageStream;
        }
    }
}



UPDATE:
I noticed another user reported better speeds when they merged documents individually and then combined the individual documents into a single document (Free Support Forum - aspose.com). I tried this and it did in fact cut my merge time down to somewhere closer to 3 minutes. This is still longer than I would like, though.

New pseudocode:

foreach (var recommendation in recommendations)
{
    file = null;

    innerFileName = "{0} - {1} Recommendation {2}".FormatString(recommendation.JudgeRequest.RequesterName, recommendation.JudgeRequest.JudgeName, recommendation.Id.ToString());
    innerFileName += ".docx";

    var replaceDict = GetReplaceDictionary(recommendation, instSigBytes);

    var temp = _Db.RecommendationMergeDocuments.SingleOrDefault(w => w.Id == recommendation.RecommendationMergeDocumentId.Value);
    if (temp != null && temp.TemplateBytes != null)
    {
        // Merge this single record into its own document.
        var doc = AsposeWord.MergeFields(temp.TemplateBytes, replaceDict);
        recommendation.RecommendationTextDoc = doc;
    }

    file = recommendation.RecommendationTextDoc;

    if (file != null && file.Length > 0)
    {
        usedIds.Add(recommendation.Id);
        filesToMerge.Add(innerFileName, file);
    }
}

var c = _Db.SaveChanges();

file = AsposeWord.CombineDocuments(filesToMerge.Values.ToList()); // Core.Extensions.ZipArchiveExtensions.CreateZipBytes(filesToMerge);

public static byte[] MergeFields(byte[] sourceDoc, Dictionary<string, object> mergeData)
{
    byte[] rVal;
    using (var wordByteStream = new MemoryStream(sourceDoc))
    {
        Document wordDoc = new Document(wordByteStream);
        wordDoc.MailMerge.CleanupOptions = Aspose.Words.Reporting.MailMergeCleanupOptions.RemoveUnusedFields;
        wordDoc.MailMerge.FieldMergingCallback = new HandleMerge();
        wordDoc.MailMerge.Execute(mergeData.Keys.ToArray(), mergeData.Values.ToArray());
        rVal = wordDoc.SaveToByteArray();
    }
    return rVal;
}

public static byte[] CombineDocuments(List<byte[]> docs)
{
    byte[] rVal;
    Document combinedDoc = null;
    Document sourceDoc = null;
    foreach (var docBytes in docs)
    {
        using (var wordByteStream = new MemoryStream(docBytes))
        {
            if (combinedDoc == null)
            {
                // The first document becomes the base that the others are appended to.
                combinedDoc = new Document(wordByteStream);
            }
            else
            {
                sourceDoc = new Document(wordByteStream);
                combinedDoc.AppendDocument(sourceDoc, ImportFormatMode.KeepSourceFormatting);
            }
        }
    }
    rVal = combinedDoc.SaveToByteArray();
    return rVal;
}

The primary problem turned out to be images in the HTML merge fields. Even though it was the same image every time, it was downloaded again for every document, which accounted for most of the merge time. A 100-record, 2-page merge takes around 3.5 minutes.

If I remove the images, it goes down to around 15 seconds for the same merge.

My final solution was to give my users a smaller set of image merge fields whose names are prefixed with Image: and to put the bytes for each image in my merge data. With the Image: prefix on the merge field name, Aspose routes the field through ImageFieldMerging and uses the passed bytes instead of downloading the image.

Assuming my users don’t add images into the merged HTML and just use the specified image merge fields, then the merge stays in the 15-20 second time frame.
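
For anyone doing something similar: the bytes that go into the merge data only need to be fetched once per distinct image, not once per record. Below is a minimal sketch of that idea using only the .NET base class library. ImageBytesCache, FetchCount and the "signature" key are hypothetical names made up for illustration, and the fetch delegate stands in for whatever download or database read you actually use; none of this is part of Aspose.Words.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Aspose.Words): fetches each distinct image once
// and returns the same byte[] for every record that needs it.
public sealed class ImageBytesCache
{
    private readonly Dictionary<string, byte[]> _cache = new Dictionary<string, byte[]>();
    private readonly Func<string, byte[]> _fetch;

    // Number of times the fetch delegate actually ran.
    public int FetchCount { get; private set; }

    public ImageBytesCache(Func<string, byte[]> fetch)
    {
        if (fetch == null) throw new ArgumentNullException(nameof(fetch));
        _fetch = fetch;
    }

    public byte[] Get(string key)
    {
        byte[] bytes;
        if (!_cache.TryGetValue(key, out bytes))
        {
            // First request for this key: do the expensive fetch and remember the result.
            bytes = _fetch(key);
            FetchCount++;
            _cache[key] = bytes;
        }
        return bytes;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Stand-in for a real download or database read (assumption for the demo).
        var cache = new ImageBytesCache(key => new byte[] { 1, 2, 3 });

        for (int record = 0; record < 100; record++)
        {
            byte[] signature = cache.Get("signature");
            // The bytes would be placed in the merge data for this record here.
        }

        Console.WriteLine(cache.FetchCount); // prints 1
    }
}
```

The sketch is not thread-safe; it is meant to be created once per merge request and thrown away afterwards, so the same image is read a single time no matter how many records use it.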

Hi Logan,

Thanks for your inquiry. Please share the following details for investigation purposes:

  • Please attach your input Word document.
  • Please create a standalone, runnable simple application (for example, a Console Application project) that demonstrates the Aspose.Words code you used to generate your output document.

Unfortunately, it is difficult to say what the problem is without the document(s) and a simplified application. We need your document(s) and a simple project to reproduce the problem. As soon as you get these pieces of information to us, we'll start our investigation into your issue.