Vikas's huge doc

Hi Roman,
As you requested, attached are the sample template document with some merge fields, the output document produced with Aspose.Word (the latest version, released today), the stored proc that populates the document (in .sql and .txt format), and the actual output data of the stored proc in a CSV file. The SP returns 2 tables. The first table returns the merge field data and the second table returns data indicating which sections have to be deleted.
Also attached are the web form code-behind and a utility class in a text file. I return the second table in the stored proc just for the delete logic of the numbered sections.

I am having the following problems:
a> The numbered sections don’t get deleted as numbered. The numbering runs from 1 to 6 in the doc. Please let me know what I am doing wrong.
b> The merge field in the first section doesn’t get populated even though there was a value.

Hi Vikas,

I can report some progress.

  1. The field did not get populated because it is misspelled. You need to right-click the field, select the Toggle Field Codes command, and check the field code. It does not have “1” at the end, whereas the field name in the SP does. What you see in <> is not really the field name but a temporary field value that Word holds for that field. You can also right-click the field and select Update Field; you will see that it no longer contains “1”.

  2. Your table headings had numbers hardcoded in them, such as “11. Text blah blah”. If you want to take advantage of automated numbering, you need to turn them all into items of a list. I did this for you in the whole document by selecting each table heading and clicking the Numbering button on the toolbar in Word. This created a numbered list of about 100 items. Now if you delete a section (in Word) you will see that the rest of the items renumber automatically.

Unfortunately, if I delete a section using Aspose.Word, the list does not seem to renumber automatically, and furthermore Word crashes if I go to the numbering properties. It means this particular numbered list is a bit too complex for Aspose.Word to handle. I will try to fix this over the weekend. In the worst case we will need to use mail merge fields for the numbers instead of the numbered list, at least temporarily.

Hi Roman,
Please let me know if you need more info … I am waiting on your feedback. Also, as you can see from my stored proc, we hit many tables to get the data. In the version you will be releasing on Feb 23, can multiple tables be fed to MailMerge.Execute? In that case we won’t have to create a temp table in the stored proc. I would also like to know how we will define the merge fields in the document in the multiple-table scenario.

As you can see my design on this is waiting for your feedback !!

I think we should continue the “multiple calls to MailMerge” approach, this will let you finish hopefully even before 23rd Feb.

I looked at the code and the SP and I found it does not look bad at all. The fact that you return multiple tables instead of one huge view with many columns is good for several reasons:

  1. You are in control and can decide whether you need one SP or several. You might wish to keep a single SP to minimize the number of database trips, but for code simplicity and maintenance you can switch to a one SP, one table approach if you want to.

  2. When a version of Aspose.Word that supports true multi-table mail merge comes out and you want to use that feature, it will be easy. You already have a dataset with multiple tables. All you will need to do is remove 7 calls to MailMerge and leave just one. You will also need to add special markers to the template document to tell Aspose.Word which field belongs to which table.

Hi Roman,

I tried the other approach you mentioned, which is available in the current version, as follows, but it throws an “Object reference not set” exception when it reaches obj.Save … The stored proc now returns 7 tables …

Dim mergeDoc0 As Document = mainDoc.MailMerge.Execute(objDS.Tables(0))
Dim mergeDoc1 As Document = mergeDoc0.MailMerge.Execute(objDS.Tables(1))
Dim mergeDoc2 As Document = mergeDoc1.MailMerge.Execute(objDS.Tables(2))
Dim mergeDoc3 As Document = mergeDoc2.MailMerge.Execute(objDS.Tables(3))
Dim mergeDoc4 As Document = mergeDoc3.MailMerge.Execute(objDS.Tables(4))
Dim mergeDoc5 As Document = mergeDoc4.MailMerge.Execute(objDS.Tables(5))
Dim mergeDoc6 As Document = mergeDoc5.MailMerge.Execute(objDS.Tables(6))
Dim mergeDoc7 As Document = mergeDoc6.MailMerge.Execute(objDS.Tables(7))

'Merge Fields population block

'Giving an error at the below method call …
mergeDoc7.Save("test.doc", SaveFormat.FormatDocument, SaveType.OpenInWord, Me.Response)

Hi Vikas,

I tried in a slightly simpler setup and it works okay for me.

Hmm… Did you say 7 tables? But you access 8 tables here. I reckon it’s the eighth table, at index 7, that is causing you the problem.

When you get the thing working, let me know what the performance and memory consumption are like. Essentially you’ve got a 30-page document opened 8 times, which is probably a good stress test for Aspose.Word and your IIS server too!

Hi Roman,
It is actually 8 tables … I made a typo earlier, and the error is at the Save call I mentioned earlier.
Maybe the stack trace below will assist you more in debugging …

Object reference not set to an instance of an object.
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.NullReferenceException: Object reference not set to an instance of an object.

Source Error:

Line 95: 'Merge Fields population block
Line 96:
Line 97: mergeDoc7.Save("Test.doc", SaveFormat.FormatDocument, SaveType.OpenInWord, Me.Response)
Line 98: Response.End()

Source File: C:\Program Files\Aspose\Aspose.Word\Demos\Aspose.Word.Demos.VB.WebForms\test.aspx.vb Line: 97

Stack Trace:

[NullReferenceException: Object reference not set to an instance of an object.]
Aspose.Word.Document.a()
Aspose.Word.Document.Save(Stream stream, SaveFormat fileFormat)
Aspose.Word.Document.Save(String fileName, SaveFormat fileFormat, SaveType saveType, HttpResponse response)
Aspose.Word.Demos.VB.WebForms.Test.Button1_Click(Object sender, EventArgs e) in C:\Program Files\Aspose\Aspose.Word\Demos\Aspose.Word.Demos.VB.WebForms\Test.aspx.vb:97
System.Web.UI.WebControls.Button.OnClick(EventArgs e)
System.Web.UI.WebControls.Button.System.Web.UI.IPostBackEventHandler.RaisePostBackEvent(String eventArgument)
System.Web.UI.Page.RaisePostBackEvent(IPostBackEventHandler sourceControl, String eventArgument)
System.Web.UI.Page.RaisePostBackEvent(NameValueCollection postData)
System.Web.UI.Page.ProcessRequestMain()

Thanks,
Vikas

Hi Vikas,

Can you save the dataset as an XML file and email it to me so that I can run exactly the same code? You should be able to do so using the DataSet’s methods.
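A minimal sketch of one way to do this with the standard System.Data calls DataSet.WriteXml and DataSet.ReadXml. The class name, table names, column names and sample values are made up for illustration. Writing the schema along with the data keeps the table definitions in the file, so a table that has no rows is still visible after reloading.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical helper, not part of Aspose.Word: dumps a DataSet to XML and reads it back.
public static class DataSetDump
{
    // Writes the DataSet to an XML file together with its inline schema.
    public static void Save(DataSet ds, string path)
    {
        ds.WriteXml(path, XmlWriteMode.WriteSchema);
    }

    // Reads back a file written by Save, using the inline schema.
    public static DataSet Load(string path)
    {
        DataSet ds = new DataSet();
        ds.ReadXml(path, XmlReadMode.ReadSchema);
        return ds;
    }

    // Builds a small two-table DataSet; the second table is left empty on purpose.
    public static DataSet BuildSample()
    {
        DataSet ds = new DataSet("MergeData");

        DataTable customer = ds.Tables.Add("Customer");
        customer.Columns.Add("CustomerName", typeof(string));
        customer.Rows.Add("Example Ltd");

        DataTable notes = ds.Tables.Add("Notes");
        notes.Columns.Add("NoteText", typeof(string));

        return ds;
    }
}
```

Calling DataSetDump.Save(ds, path) on the DataSet returned by the stored proc and then checking Rows.Count for each table of DataSetDump.Load(path) is one quick way to spot a table that came back without rows.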

Hi Roman,
I think I figured it out, and now the multiple MailMerge.Execute calls are working great! Also, Update Field solved the problem of the field not getting populated!

The XML that I saved out of the dataset revealed the problem … though the SP returned 8 tables, the last table had no rows. So the DataTable has to be checked for rows before calling the Execute method.

Below is the snippet and the helper function …

Dim app As Word = New Word()
Dim mainDoc As Document = app.Open(Server.MapPath("/Documents/Test.doc"))
mainDoc = MailMergeExecute(mainDoc, objDS.Tables(0))
mainDoc = MailMergeExecute(mainDoc, objDS.Tables(1))
mainDoc = MailMergeExecute(mainDoc, objDS.Tables(2))
mainDoc = MailMergeExecute(mainDoc, objDS.Tables(3))
mainDoc = MailMergeExecute(mainDoc, objDS.Tables(4))
mainDoc = MailMergeExecute(mainDoc, objDS.Tables(5))
mainDoc = MailMergeExecute(mainDoc, objDS.Tables(6))
mainDoc = MailMergeExecute(mainDoc, objDS.Tables(7))
mainDoc.Save("Test.doc", SaveFormat.FormatDocument, SaveType.OpenInWord, Me.Response)

'Helper Function
Public Function MailMergeExecute(ByVal objDoc As Document, ByVal objTable As DataTable) As Document
    If objTable.Rows.Count > 0 Then
        Return objDoc.MailMerge.Execute(objTable)
    Else
        Return objDoc
    End If
End Function

Thanks,
Vikas

Aspose.Word 1.1.3 is out and it no longer throws an exception when merging with an empty table, but you should probably keep your code, because otherwise you will get a copy of the source document in your destination document even if there are no records.

Also, I fixed the issue with list numbering. Word no longer crashes and the numbering is preserved across sections. Numbering across sections is a bit delicate; I’ve done this in your document and emailed it to you. Basically, I found that you need to number the items within one section first and then insert a section break between them; then all works fine.

Here is what happens during mail merge.

  1. The original document is opened and loaded into memory only once. In memory the document is represented in some form of nodes (composite).
  2. When mail merge is executed it copies all the nodes of the original document into a new document.
  3. Mail merge goes through the nodes of the new document, finds nodes that represent mail merge fields and replaces text in them.
  4. If the table contains more rows, mail merge repeats over (copies all original nodes and appends to the new document) until all rows are processed.

Therefore, each call to MailMerge essentially copies the whole in-memory document into a new one. This is a consequence of the fact that the current MailMerge was designed to merge fields from one table only.
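The steps above can be illustrated with a toy model. This is not Aspose.Word code: the class name, the flat list of string nodes and the {FieldName} marker syntax are all assumptions made for the sketch. It copies every template node once per data row, replaces the fields it knows and counts the copies, so the cost of calling the merge repeatedly becomes visible.

```csharp
using System;
using System.Collections.Generic;

// Toy model only. A "document" is a flat list of text nodes and a merge field
// is a node written as {FieldName}. The empty-table case is not modelled.
public static class ToyMailMerge
{
    // Total number of nodes copied by all Execute calls so far.
    public static int NodesCopied = 0;

    // Copies the template once per row and fills in the fields found in that row.
    // Fields that the row does not contain are left in place for a later call.
    public static List<string> Execute(List<string> template,
                                       List<Dictionary<string, string>> rows)
    {
        List<string> result = new List<string>();
        foreach (Dictionary<string, string> row in rows)
        {
            foreach (string node in template)
            {
                NodesCopied++;
                bool isField = node.Length >= 2
                    && node[0] == '{'
                    && node[node.Length - 1] == '}';
                string value;
                if (isField && row.TryGetValue(node.Substring(1, node.Length - 2), out value))
                    result.Add(value);
                else
                    result.Add(node);
            }
        }
        return result;
    }
}
```

In this model, merging a four-node template with two single-row tables in two successive calls copies eight nodes in total, where a single call with all fields available would copy four.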

When the next version, 1.2, which is designed to merge fields from multiple tables (DataSet), comes out, those extra copies will no longer be made.

In v1.2 to specify table names inside the document we plan to use the following approach:

  1. To identify start of table, insert a mail merge field with name TableStart:XXX where XXX is your table name or index in the DataSet.
  2. Insert mail merge fields with field names of the table as usual.
  3. To identify end of table, insert a mail merge field with name TableEnd:XXX where XXX is your table name or index in the DataSet.
  4. TableStart/TableEnd could be nested inside each other, and the nesting must be compatible with the relationships between tables established in the DataSet.
  5. We might need to support fully qualified field names such as MyTable.MyFieldName so you can refer to some outer table from within an inner nested table.
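To make the proposed markers concrete, here is a small sketch of how a sequence of merge field names could be scanned. It is not the v1.2 implementation, which does not exist yet; the class and method names are hypothetical. It only checks that TableStart/TableEnd markers are balanced and properly nested, and it assigns each ordinary field to the innermost open table. It does not check the nesting against DataSet relations.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: interprets TableStart:XXX / TableEnd:XXX markers
// in a list of merge field names taken from a template, in document order.
public static class RegionScanner
{
    private const string StartPrefix = "TableStart:";
    private const string EndPrefix = "TableEnd:";

    // Returns each ordinary field qualified as "Table.Field" when it sits inside
    // a region, or unchanged when it sits outside any region.
    // Throws FormatException when the markers are unbalanced or badly nested.
    public static List<string> Qualify(IEnumerable<string> fieldNames)
    {
        Stack<string> open = new Stack<string>();
        List<string> qualified = new List<string>();

        foreach (string name in fieldNames)
        {
            if (name.StartsWith(StartPrefix, StringComparison.Ordinal))
            {
                open.Push(name.Substring(StartPrefix.Length));
            }
            else if (name.StartsWith(EndPrefix, StringComparison.Ordinal))
            {
                string table = name.Substring(EndPrefix.Length);
                if (open.Count == 0 || open.Peek() != table)
                    throw new FormatException("Unexpected " + name);
                open.Pop();
            }
            else
            {
                // An ordinary field belongs to the innermost open table, if any.
                qualified.Add(open.Count > 0 ? open.Peek() + "." + name : name);
            }
        }

        if (open.Count > 0)
            throw new FormatException("Missing " + EndPrefix + open.Peek());

        return qualified;
    }
}
```

For a template whose fields are Title, TableStart:Orders, OrderId, TableStart:Items, Sku, TableEnd:Items, TableEnd:Orders, this yields Title, Orders.OrderId and Items.Sku. The table and field names here are examples only.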

If you see an alternative or better way please suggest.

Hi Roman,
TableStart:XXX and TableEnd:XXX should work well …
As for the qualified field names, it would be good if a table could also be referenced by index.
Something like Table(0).FieldName, so that we don’t have to explicitly name a DataTable in the code-behind.

Will there be support for documents whose field names are unique throughout the document, so that TableStart:XXX and TableEnd:XXX and the qualified field names won’t be required?
This would be good for big document templates that already exist, because there would be no need to revisit the standard template.

The new standard of TableStart:XXX and TableEnd:XXX and the qualified field names is good only for new templates being made …

Please let me know your thoughts on this…

Thanks,
Vikas

I understand your situation with the existing template.

I cannot think of any elegant generic solution that can be embedded into Aspose.Word for this yet. Technically speaking in your case TableStart/TableEnd should not be used and Aspose.Word should somehow look into all tables to find a field just by name. This does not sound nice.

I think you could just write a simple generic function that takes all tables in your DataSet, combines all their columns into one table and executes MailMerge on that one data table. In this case we solve the problem of multiple documents in memory using current version and you don’t have to wait for the new version to come out and I don’t have to try to fit something that does not seem to fit into Aspose.Word.

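The function could look like this (just an idea, a sketch rather than tested code).

// I assume all your tables have just one row.
// Requires: using System.Data;
public static DataTable JoinAllTables(DataSet srcDataSet)
{
    DataTable resultTable = new DataTable();

    // First pass: add all tables' columns to the result table.
    // A DataColumn can belong to only one table, so create a new column
    // with the same name and type instead of adding the original one.
    foreach (DataTable table in srcDataSet.Tables)
    {
        foreach (DataColumn column in table.Columns)
        {
            // Assumes column names are unique across tables; if a name repeats,
            // the first definition is kept and the last table's value wins.
            if (!resultTable.Columns.Contains(column.ColumnName))
                resultTable.Columns.Add(column.ColumnName, column.DataType);
        }
    }

    // Second pass: copy each table's single row into one result row.
    DataRow resultRow = resultTable.NewRow();
    foreach (DataTable table in srcDataSet.Tables)
    {
        // A table without rows contributes no data; its columns stay DBNull.
        if (table.Rows.Count == 0)
            continue;

        foreach (DataColumn column in table.Columns)
            resultRow[column.ColumnName] = table.Rows[0][column];
    }
    resultTable.Rows.Add(resultRow);

    return resultTable;
}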

This sounds good …
I hope that in the new version, when just one table is passed to MailMerge.Execute, we won’t have to explicitly specify TableStart:0 and TableEnd:0 …

Thanks,
Vikas

The MailMerge.Execute(DataTable) signature will remain for exactly that case.
MailMerge.Execute(DataSet) will be added.

Hi Vikas,

Just wanted to check whether you managed to get your solution working and are still using Aspose.Word. Do you need anything else from me?

Hi Roman,
Yes, we are going ahead with Aspose …
Just wanted to know when the support for a DataSet with multiple tables, as discussed earlier, will be released?

Thanks,
Vikas

Although the “beta” release with repeatable regions planned for 23rd Feb was delayed because we brought work on office drawing objects forward, the planned release date for Aspose.Word 1.2 is still 1st March.

Just let me know why you are still planning on inserting TableStart: and TableEnd: fields. If they are for this big document we’ve been working on, I thought you liked the idea of building a single table programmatically from several tables. Or maybe they are just for your other documents?