A bit of trouble with RemoveEmptyParagraphs and DeleteFields

Greetings –

I’m having a bit of trouble with RemoveEmptyParagraphs + DeleteFields(). I’ve seen other posts on this topic, including a message from Dimitry suggesting that, in addition to that property and method, you should add the following code:

NodeList fields = doc.SelectNodes("//FieldStart");
foreach (FieldStart field in fields)
{
    if (field.FieldType == FieldType.FieldMergeField)
        field.ParentNode.Remove();
}

That works most of the time. Occasionally, however, I get an error while going through this loop that says, essentially, "Cannot remove because there is no parent".

Can you describe why a field would not have a ParentNode? Would the loop be improved if I checked for a ParentNode? Maybe this:

NodeList fields = doc.SelectNodes("//FieldStart");
foreach (FieldStart field in fields)
{
    if (field.FieldType == FieldType.FieldMergeField &&
        field.ParentNode != null)
        field.ParentNode.Remove();
}

Here’s the actual code I’m using, plus the template and the output.

doc.MailMerge.RemoveEmptyParagraphs = true;
doc.MailMerge.Execute(dv);
doc.MailMerge.DeleteFields();

// This is the Aspose way of removing paragraphs that have empty fields
NodeList fields = doc.SelectNodes("//FieldStart");
foreach (FieldStart field in fields)
{
    if (field.FieldType == FieldType.FieldMergeField &&
        field.ParentNode != null)
        field.ParentNode.Remove();
}


In the output, you’ll see a blank line between the contact name and address. (This is where the Company name field was located.) Obviously, I would like to remove the blank line between the name and the address.

If several FieldStart nodes belong to the same paragraph, the paragraph is removed when the loop reaches the first FieldStart. By the time the loop reaches the second one, the paragraph has already been removed, that is, detached from the child nodes of its parent, so it no longer has a parent and you get this message. In other words, you cannot remove the same node twice. To avoid that, check field.ParentNode.ParentNode != null.

Hope this helps,

That’s exactly what I needed to know. Thanks very much.

Michael

One more question, Vladimir:

With this improved loop, is it necessary still to call the DeleteFields() method? Won’t all of the remaining fields be eliminated?

Also, is there any point in setting RemoveEmptyParagraphs? Seems like both of those could be removed.

Thanks again for your thoughts.

DeleteFields also removes NEXT fields, so it makes sense to run the loop first and then call DeleteFields to clean up what remains. Please note that the loop removes every paragraph that contains a merge field. It would be a good idea to check whether those paragraphs are in fact empty or contain only whitespace characters. So I would first collect these paragraphs in the loop, e.g. by adding them to an ArrayList, then execute DeleteFields, and then iterate over the collected paragraphs, checking whether they contain any characters besides whitespace, so that only paragraphs without text content are removed.
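A minimal sketch of the whitespace check described above, written in plain C# so it runs without Aspose.Words. The names BlankParagraphFilter, IsBlank and IndexesToRemove are made up for illustration, and the comments at the end only outline how the check might be wired to the calls already used in this thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Aspose.Words.
static class BlankParagraphFilter
{
    // True when the text is null, empty, or holds nothing but whitespace
    // (spaces, tabs, line or paragraph break characters).
    public static bool IsBlank(string paragraphText)
    {
        return string.IsNullOrWhiteSpace(paragraphText);
    }

    // Takes the text of each collected paragraph, read after DeleteFields
    // has run, and returns the positions of those that are safe to remove.
    public static List<int> IndexesToRemove(IList<string> paragraphTexts)
    {
        var result = new List<int>();
        for (int i = 0; i < paragraphTexts.Count; i++)
        {
            if (IsBlank(paragraphTexts[i]))
                result.Add(i);
        }
        return result;
    }
}

// Possible wiring, using only calls that appear in this thread:
//   1. Loop over doc.SelectNodes("//FieldStart") and add field.ParentNode
//      of every merge field to an ArrayList (skipping duplicates).
//   2. doc.MailMerge.DeleteFields();
//   3. For each collected paragraph p that still has a parent:
//        if (BlankParagraphFilter.IsBlank(p.GetText())) p.Remove();
```

Because only paragraphs that held a merge field are collected, blank spacer paragraphs elsewhere in the template are never candidates for removal.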

RemoveEmptyParagraphs works in the following way. After each merge field is merged with empty data, it checks whether the containing paragraph has any content and removes the paragraph if it is empty. However, it does not differentiate between whitespace and text, so if the paragraph contains at least one space character besides the merge field, it does not get removed. We will improve this in the future, but right now it works this way. Hope this clarifies things a bit. Please let me know if you have further questions.

Best regards,

I didn’t quite understand what you meant in this posting. Do you have an example?

Since you suggest calling DeleteFields before checking the paragraph contents, it seems like that would result in all empty paragraphs being removed – even ones you want (e.g., blank lines between paragraphs.) Of course, you only want to remove a paragraph that contains an empty MergeField and has no other text, right?

Seems like this ability to remove fields (if the parent node paragraph is empty) would be a high priority for future development, at least I hope it is. Anyway, a further example would be really helpful.

It is hard to write a universal example that will fit all possible cases. Please attach your document before and after the merge and point out the paragraphs that you would like to get rid of. I will then put together a code example that fits your case.

That would be greatly appreciated. I’ve attached the template and some output in a zip file.

Notice how the address block has <<Contact_Title>>. Some records don’t have a title, so this line remains blank in the sample output.

If I use the modified loop we discussed earlier, it removes the blank line for <<Contact_Title>> just fine. However, if another field, say <<Opportunity_Amount>> (which is embedded in the main paragraph), is blank, the loop (as you would expect) removes the entire paragraph, which obviously isn’t desired.

Also, notice the blank lines between paragraphs in the template.

I tried this, but it didn’t work right:

NodeList fields = doc.SelectNodes("//FieldStart");
foreach (FieldStart field in fields)
{
    if (field.FieldType == FieldType.FieldMergeField &&
        field.ParentNode.ParentNode != null &&
        field.ParentNode.GetText().Trim() == "")
        field.ParentNode.Remove();
}

Thanks again for taking a look.

I’ve played with this some more and I think I can isolate the problems to some more fundamental issues:

(1) I think that RemoveEmptyParagraphs may not be working as I’m expecting. I’ve checked the DataTable several times and found that “blank” fields definitely contain empty strings. For example, my Contacts data table has a field for CONTACT_TITLE and it contains either an empty string or a string value – no null values.

When I execute the mail merge, I set RemoveEmptyParagraphs = true before Execute(dv). The <<Contact_Title>> field is apparently filled with the empty string, but the paragraph isn’t removed, resulting in the blank line in the address block. (You can see this in the attached file from the previous forum entry.)

Since the field gets filled by Aspose.Words, the loop to check for “//FieldStart” never finds that paragraph, so the loop seems ineffective for fields that get filled.

(2) The second issue has to do with fields that never get filled, i.e., fields that don’t exist within the DataTable. Since these remain after the mail merge has been executed, the loop to check for “//FieldStart” finds the fields and can do something about them. I’m still working on a solution for checking whether the ParentNode has any text other than the MergeField and, if not, removing the paragraph.

So I think I have two problems.

Michael

I have looked into your case and the solution seems to be very simple. As far as I can see, the problem with empty paragraph removal is caused by the fact that the values for empty fields are nulls, not empty strings, and that makes a difference. Such fields just don't get merged and, consequently, the RemoveEmptyParagraphs setting does not work for them. To force them to merge with empty string data, and thus make RemoveEmptyParagraphs work, you should use the MergeField event. The code should be as follows:

doc.MailMerge.RemoveEmptyParagraphs = true;
doc.MailMerge.MergeField += new MergeFieldEventHandler(MergeWithCleanEmptyParagraphs);
doc.MailMerge.Execute(...);
doc.MailMerge.DeleteFields();

And the code for event handler is simple:

private void MergeWithCleanEmptyParagraphs(object sender, MergeFieldEventArgs e)
{
    if (e.FieldValue == null)
        e.Text = "";
}

That's all. No complicated field start searches :)

Hope this helps,

That solution would have been great, if it wasn’t for the fact that I’m doing several merges at one time. With this solution, the merge fields that would have been used in the next merge operation get set to the empty string and removed before the next merge can process them.

Really close though!

I’m trying to think of a way during the MergeWithCleanEmptyParagraphs handler to see if the data source contains the field in question. If not, it should not do anything with it (so the next merge can handle it). If it does AND the value is null, then return the empty string.

It doesn’t look like the data source is accessible via MergeFieldEventArgs, but maybe I haven’t discovered it yet.
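One possible way to express that rule, as a sketch only: keep your own reference to the DataTable (for example in a field of the class that owns the handler) and decide from the field name whether to touch the field at all. The names MergeFieldDecision and TextOverride are hypothetical, and the part in comments assumes the event args expose the merge field's name as e.FieldName, which should be confirmed against the Aspose.Words version in use.

```csharp
using System;
using System.Data;

// Hypothetical helper, not part of Aspose.Words.
static class MergeFieldDecision
{
    // Returns "" when the field belongs to this data source but has no value,
    // so it is merged as empty text. Returns null when the handler should
    // leave the field alone, either because it has a real value or because
    // it belongs to a later merge.
    public static string TextOverride(DataTable source, string fieldName, object fieldValue)
    {
        // Column lookup by name ignores case, so Contact_Title matches CONTACT_TITLE.
        bool inSource = source.Columns.Contains(fieldName);
        bool hasNoValue = fieldValue == null || fieldValue is DBNull;
        return (inSource && hasNoValue) ? "" : null;
    }
}

// Possible use inside the MergeField handler (assumes e.FieldName exists and
// that currentTable is a field you set before calling Execute):
//   string text = MergeFieldDecision.TextOverride(currentTable, e.FieldName, e.FieldValue);
//   if (text != null) e.Text = text;
```

With this rule, a field that the current table does not know about is skipped, so it is still there when the next merge runs.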

We’re so close!

Vladimir –

I changed the logic of the code a bit, re-arranged a few things and now the code you provided earlier today did the trick.

Thanks for your help.

Glad you've made it work :)

Still, the problems that you have raised indeed exist. We are discussing now how to change our current API a bit to make it more convenient and to avoid the problems with empty data and paragraphs removal.

Best regards,

The issues you have found earlier (filed as WORDSNET-560) have been fixed in this .NET update and this Java update.

