Merge Field Formatting

nutan · October 29, 2010, 2:34am

Hello there,

I am facing a merge field formatting issue.
My source document contains placeholder codes that are replaced with merge fields at run time as follows:

_builder.MoveTo(currentNode);
_builder.InsertField(string.Format("MERGEFIELD {0}", mergeField), string.Format("«{0}»", mergeField));

But when mail merge is run, the merge fields are not replaced with the same formatting that the placeholder codes had in the source document. For example, in the attached document a placeholder code needs to be replaced with a value in the same formatting (green and bold), but what comes out in the output is plain text with no formatting.
I read the documentation, and it seems I need to apply the MERGEFORMAT switch when setting up the merge field at run time. Or is there any other way to retain the formatting?

Please suggest how to apply the MERGEFORMAT switch at run time so that I can apply the same font settings as those of the placeholder code.

Thank you!

adam.skelton · October 29, 2010, 4:53am

Hi there,
Thanks for your inquiry.
Yes, the MERGEFORMAT switch is needed to retain the same formatting in the field during mail merge. I’m afraid that, currently, using it on its own will still not produce the correct behaviour. This is because the replacement field is inserted with whatever formatting the DocumentBuilder is currently set to. That formatting is then applied to the result when mail merge is executed, even with the switch set.
To achieve the correct behaviour you will need to insert the replacement field into the document and then move the run nodes from the marker paragraph, which carry the formatting you want to retain, into the field result. The merged field should then have the same format as the marker text.
Please see the code below, which achieves this:

DocumentBuilder _builder = new DocumentBuilder(doc);
// Placeholder finding logic here
Node currentNode = _builder.CurrentNode;
_builder.MoveTo(currentNode);
// Insert the replacement field into the document. Use the MERGEFORMAT switch to retain formatting during merging.
Field field = _builder.InsertField(string.Format(@"MERGEFIELD {0} \* MERGEFORMAT", mergeField));
// If marker node is a paragraph append all the children from the paragraph into the field result (before the field end).
// This should result in the merged data inheriting the formatting from the marker node.
if (currentNode.NodeType == NodeType.Paragraph)
{
    Paragraph para = (Paragraph)currentNode;
    foreach (Node node in para.ChildNodes.ToArray())
    {
        field.End.ParentParagraph.InsertBefore(node, field.End);
    }
}
else if (currentNode.NodeType == NodeType.Run)
{
    // If the current node is just a run insert it in the same place in order to produce the same behaviour.
    field.End.ParentParagraph.InsertBefore(currentNode, field.End);
}

If you have any trouble, please feel free to ask.
Thanks,

nutan · October 29, 2010, 7:08am

Well, it doesn’t solve the formatting issue. Instead, in some places it removed the spacing between codes.

Here is the code that I am using:

public ReplaceAction Replacing(ReplacingArgs e)
{
    // This is a Run node that contains either the beginning or the complete match.
    currentNode = e.MatchNode;

    // The first (and may be the only) run can contain text before the match, 
    // in this case it is necessary to split the run.
    if (e.MatchOffset > 0)
        currentNode = SplitRun((Run)currentNode, e.MatchOffset);

    _parent._builder = new DocumentBuilder((Document)e.MatchNode.Document);
    _parent._builder.MoveTo(currentNode);
    InsertMergeField(fieldName);

    // This array is used to store all nodes of the match for further removing.
    var runs = new ArrayList();
    // Find all runs that contain parts of the match string.
    var remainingLength = e.Match.Value.Length;
    while ((remainingLength > 0) &&
    (currentNode != null) &&
    (currentNode.GetText().Length <= remainingLength))
    {
        runs.Add(currentNode);
        remainingLength = remainingLength - currentNode.GetText().Length;

        // Select the next Run node.
        // Have to loop because there could be other nodes such as BookmarkStart etc.
        do
        {
            currentNode = currentNode.NextSibling;
        }
        while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
    }

    // Split the last run that contains the match if there is any text left.
    if ((currentNode != null) && (remainingLength > 0))
    {
        SplitRun((Run)currentNode, remainingLength);
        runs.Add(currentNode);
    }

    // Now remove all runs in the sequence.
    foreach (Run run in runs)
        run.Remove();

    // Signal to the replace engine to do nothing because we have already done all what we wanted.
    return ReplaceAction.Skip;
}

private void InsertMergeField(string mergeField)
{
    //_parent._builder.InsertField(string.Format("MERGEFIELD {0}", mergeField), string.Format("«{0}»", mergeField));
    Field field = _parent._builder.InsertField(string.Format(@"MERGEFIELD {0} \* MERGEFORMAT", mergeField));
    //If marker node is a paragraph append all the children from the paragraph into the field result(before the field end).
    // This should result in the merged data inheriting the formatting from the marker node.
    if (currentNode.NodeType == NodeType.Paragraph)
    {
        Paragraph para = (Paragraph)currentNode;
        foreach (Node node in para.ChildNodes.ToArray())
        {
            field.End.ParentParagraph.InsertBefore(node, field.End);
        }
    }

    else if (currentNode.NodeType == NodeType.Run)
    {

        // If the current node is just a run insert it in the same place in order to produce the same behaviour.
        field.End.ParentParagraph.InsertBefore(currentNode, field.End);
    }
}

alexey.noskov · October 29, 2010, 1:03pm

Hi

Thank you for the additional information. Please try using the following code to replace your placeholders with merge fields:

[Test]
public void Test001()
{
    Document doc = new Document(@"Test001\in.docx");
    doc.Range.Replace(new Regex(@"\<(?<FieldName>.*?)/\>"), new ReplaceEvaluatorFindAndInsertMergefield(), false);
    doc.Save(@"Test001\out.doc");
}
private class ReplaceEvaluatorFindAndInsertMergefield : IReplacingCallback
{
    /// <summary>
    /// This method is called by the Aspose.Words find and replace engine for each match.
    /// This method replaces the match string with a merge field, even if it spans multiple runs.
    /// </summary>
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;
        // The first (and may be the only) run can contain text before the match, 
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);
        // This array is used to store all nodes of the match for further removing.
        ArrayList runs = new ArrayList();
        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
        (remainingLength > 0) &&
        (currentNode != null) &&
        (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;
            // Select the next Run node. 
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }
        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }
        // Create a DocumentBuilder and insert the merge field.
        DocumentBuilder builder = new DocumentBuilder(e.MatchNode.Document as Document);
        builder.MoveTo((Run)runs[runs.Count - 1]);
        string fieldName = e.Match.Groups["FieldName"].Value;
        builder.InsertField(string.Format("MERGEFIELD {0}", fieldName), string.Format("«{0}»", fieldName));
        // Now remove all runs in the sequence.
        foreach (Run run in runs)
            run.Remove();
        // Signal to the replace engine to do nothing because we have already done all what we wanted.
        return ReplaceAction.Skip;
    }
    /// <summary>
    /// Splits text of the specified run into two runs.
    /// Inserts the new run just after the specified run.
    /// </summary>
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}

Best regards,

nutan · November 1, 2010, 1:46am

Thanks Alexey,

I am already able to replace the placeholder codes with merge fields successfully. My query was about the formatting issue that occurs when the merge fields are replaced with actual values.

My query was:
My source document contains placeholder codes that are replaced with merge fields at run time as follows:

_builder.MoveTo(currentNode);
_builder.InsertField(string.Format("MERGEFIELD {0}", mergeField), string.Format("«{0}»", mergeField));

But when mail merge is run, the merge fields are not replaced with the same formatting that the placeholder codes had in the source document. For example, in the attached document a placeholder code needs to be replaced with a value in the same formatting (green and bold), but what comes out in the output is plain text with no formatting.
I read the documentation, and it seems I need to apply the MERGEFORMAT switch when setting up the merge field at run time. Or is there any other way to retain the formatting?

Please suggest how to apply the MERGEFORMAT switch at run time so that I can apply the same font settings as those of the placeholder code.

adam.skelton · November 1, 2010, 3:52am

Hi there,
Thanks for your inquiry.
As stated above, you can use the MERGEFORMAT switch in your code like this:

_builder.InsertField(string.Format(@"MERGEFIELD {0} \* MERGEFORMAT", mergeField));
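The switch only takes effect if the backslash before the asterisk reaches the field code. As a minimal sketch, assuming a hypothetical helper class named MergeFieldCode (it is not part of Aspose.Words), the field code string can be built in one place so the backslash cannot be dropped by accident:

```csharp
using System;

public static class MergeFieldCode
{
    // Hypothetical helper: builds the field code text passed to DocumentBuilder.InsertField.
    public static string Build(string fieldName)
    {
        // The verbatim string (@"...") keeps the backslash literal.
        // In a regular string the same text would be written as "MERGEFIELD {0} \\* MERGEFORMAT".
        return string.Format(@"MERGEFIELD {0} \* MERGEFORMAT", fieldName);
    }
}
```

With this in place, the call above would read _builder.InsertField(MergeFieldCode.Build(mergeField)).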

Thanks,

nutan · November 1, 2010, 6:16am

Thanks Adam,

I had already tested merge fields formatting with the code you specified.

Field field = _parent._builder.InsertField(string.Format(@"MERGEFIELD {0} \* MERGEFORMAT", mergeField));
//If marker node is a paragraph append all the children from the paragraph into the field result (before the field end).
// This should result in the merged data inheriting the formatting from the marker node.
if (currentNode.NodeType == NodeType.Paragraph)
{
    Paragraph para = (Paragraph)currentNode;
    foreach (Node node in para.ChildNodes.ToArray())
    {
        field.End.ParentParagraph.InsertBefore(node, field.End);
    }
}

else if (currentNode.NodeType == NodeType.Run)
{
    // If the current node is just a run insert it in the same place in order to produce the same behaviour.
    field.End.ParentParagraph.InsertBefore(currentNode, field.End);
}

But I am unable to retain the formatting.
Instead, the code you suggested resulted in a spacing issue: the white space between merge fields in the source document is lost in the output. I have attached the In/Out files for your reference.

Please guide me on how I can get the same formatting for the values that replace the merge fields.

Thank you!

alexey.noskov · November 1, 2010, 10:12am

Hi Nutan,

Thank you for the additional information. Have you tried using my method to replace placeholders with merge fields? As far as I can see, after executing mail merge all values have the desired formatting. Here is my test code:

Document doc = new Document(@"Test001\in.docx");
doc.Range.Replace(new Regex(@"\<(?<FieldName>.*?)/\>"), new ReplaceEvaluatorFindAndInsertMergefield(), false);
// Execute mail merge (just for testing).
string[] names = doc.MailMerge.GetFieldNames();
doc.MailMerge.Execute(names, names);
doc.Save(@"Test001\out.doc");

You can find ReplaceEvaluatorFindAndInsertMergefield in my previous answer.
Best regards,

nutan · November 1, 2010, 12:49pm

Hello Alexey,

I was already using the same code, but it results in partial formatting.
In some places the formatting comes out correct, whereas in others it does not.
Attached are the IN and OUT documents again for your reference. You will notice that two of the codes are not replaced with the correct formatting.

Here is my code again. I compared it with yours and it looks right to me:

public ReplaceAction Replacing(ReplacingArgs e)
{
    var fieldName = e.Match.Groups["1"].Value;

    // This is a Run node that contains either the beginning or the complete match.
    currentNode = e.MatchNode;

    // The first (and may be the only) run can contain text before the match,
    // in this case it is necessary to split the run.
    if (e.MatchOffset > 0)
        currentNode = SplitRun((Run)currentNode, e.MatchOffset);

    // This array is used to store all nodes of the match for further removing.
    ArrayList runs = new ArrayList();

    // Find all runs that contain parts of the match string.
    int remainingLength = e.Match.Value.Length;
    while ((remainingLength > 0) &&
    (currentNode != null) &&
    (currentNode.GetText().Length <= remainingLength))
    {
        runs.Add(currentNode);
        remainingLength = remainingLength - currentNode.GetText().Length;

        // Select the next Run node.
        // Have to loop because there could be other nodes such as BookmarkStart etc.
        do
        {
            currentNode = currentNode.NextSibling;
        }
        while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
    }

    // Split the last run that contains the match if there is any text left.
    if ((currentNode != null) && (remainingLength > 0))
    {
        SplitRun((Run)currentNode, remainingLength);
        runs.Add(currentNode);
    }
    //Create Document builder and insert Merge Field
    _parent._builder = new DocumentBuilder((Document)e.MatchNode.Document);
    _parent._builder.MoveTo((Run)runs[runs.Count - 1]);

    fieldName = fieldName.ToUpper();
    if (fieldName == "PAGE/")
    {
        _parent._builder.InsertBreak(BreakType.PageBreak);
    }
    else
    {
        _parent._builder.InsertField(string.Format("MERGEFIELD {0}", fieldName), string.Format("«{0}»", fieldName));
    }

    // Now remove all runs in the sequence.
    foreach (Run run in runs)
        run.Remove();

    // Signal to the replace engine to do nothing because we have already done all what we wanted.
    return ReplaceAction.Skip;
}

Please guide if I am missing something.

Thank you!

adam.skelton · November 1, 2010, 5:08pm

Hi there,
Thanks for your inquiry.
This is happening because you are referencing the matched group incorrectly which results in a blank string.

string fieldName = e.Match.Groups["1"].Value;

This should be string fieldName = e.Match.Groups[1].Value; or how Alexey has referenced it in his code.
Thanks,

nutan · November 2, 2010, 12:11am

Still the same issue.

Also, e.Match.Groups[1].Value and e.Match.Groups["1"].Value return the same value for me.
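A small sketch of the two lookups discussed here, using only System.Text.RegularExpressions. The placeholder name CustomerName and the class GroupLookupDemo are made up for illustration. With an unnamed group, the index 1 and the string "1" refer to the same capture; with a named group, the name and the index 1 do:

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupLookupDemo
{
    // Unnamed group: Groups[1] and Groups["1"] return the same capture.
    public static string[] Unnamed(string input)
    {
        Match m = Regex.Match(input, @"\<(.*?)/\>");
        return new[] { m.Groups[1].Value, m.Groups["1"].Value };
    }

    // Named group: Groups["FieldName"] and Groups[1] return the same capture.
    public static string[] Named(string input)
    {
        Match m = Regex.Match(input, @"\<(?<FieldName>.*?)/\>");
        return new[] { m.Groups["FieldName"].Value, m.Groups[1].Value };
    }
}
```

Which form is safe therefore depends on the pattern in use. This sketch does not cover looking up a named group through the string "1", so when the pattern names its group, referencing it by that name is the less surprising choice.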

Thanks

adam.skelton · November 2, 2010, 12:37am

Hi there,
Thanks for this additional information.
Could you please attach the full code you are using? Both methods appear to be working on this side.
Thanks,

nutan · November 2, 2010, 2:19am

Hello Adam,

Good that you pointed out that my code is working fine on your side. I narrowed the issue down and found that it is not in inserting the merge fields but in replacing them with actual values.

The formatting issue affects only two fields, and these two fields are inserted using DocumentBuilder.InsertHtml() as follows:

var builder = new DocumentBuilder(e.Document);
builder.MoveToMergeField(e.DocumentFieldName);
if (e.FieldValue != null)
    builder.InsertHtml((string)e.FieldValue);

// The HTML text itself should not be inserted. We have already inserted it as an HTML.
e.Text = string.Empty;

In my test scenario, the values for these two fields carry no HTML formatting of their own, so InsertHtml() outputs them as plain text.
Though there are different tricks available in C#, is there any method in Aspose with which I can check whether e.FieldValue is HTML or not?

adam.skelton · November 2, 2010, 4:15am

Hi there,
Thanks for your inquiry.
If I understand correctly, you need to detect whether the FieldValue about to be merged is an HTML string or not.
To do this you can load your FieldValue string into a new MemoryStream and pass this to the FileFormatUtil.DetectFileFormat method. This returns a FileFormatInfo object which contains the load details of the stream. You then check whether FileFormatInfo.LoadFormat is equal to LoadFormat.Html, which tells you whether the string is HTML or not. I’m not quite sure if it will work well for HTML snippets though, so you will need to give it a test.
If this is not the case, could you please clarify, perhaps with an example?
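The steps above can be sketched roughly as follows. This assumes the Aspose.Words assembly is referenced and that the value is encoded as UTF-8. The class and method names are made up for illustration, and, as noted, detection may not recognise short HTML fragments:

```csharp
using System.IO;
using System.Text;
using Aspose.Words;

public static class HtmlDetection
{
    // Hypothetical helper: returns true when Aspose.Words detects the text as an HTML document.
    public static bool IsDetectedAsHtml(string value)
    {
        // Assumption: UTF-8 is an acceptable encoding for the field value.
        using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(value)))
        {
            FileFormatInfo info = FileFormatUtil.DetectFileFormat(stream);
            return info.LoadFormat == LoadFormat.Html;
        }
    }
}
```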
Thanks,

nutan · November 2, 2010, 4:41am

Thanks Adam,

Yes, you understood my query correctly.
I tested it, and FileFormatUtil.DetectFileFormat doesn’t work for HTML strings.

alexey.noskov · November 2, 2010, 7:53am

Hi Nutan,

Thanks for your inquiry. Such a method is out of the scope of Aspose.Words. If you need to determine whether a string contains HTML or not, you need to create your own method to do this.
The simplest approach is to check whether the string contains ‘<’ or ‘>’ characters and, if it does, to assume that the string is HTML.
If you would like to use DetectFileFormat, you can load your string into a stream and pass the stream as a parameter to the DetectFileFormat method.
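A minimal sketch of such a custom check, written as a slightly stricter variation on the character test suggested above: it looks for something shaped like a tag rather than a bare ‘<’ or ‘>’, so text such as "3 < 5" is not treated as HTML. The class name HtmlSniffer and the pattern are assumptions for illustration, and this is a heuristic, not an HTML parser:

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlSniffer
{
    // Matches text shaped like an opening, closing or self-closing tag, e.g. <b>, </p>, <br/>.
    private static readonly Regex TagPattern =
        new Regex(@"</?[A-Za-z][A-Za-z0-9]*(\s[^<>]*)?/?>", RegexOptions.Compiled);

    // Returns true when the value contains at least one tag-like token.
    public static bool LooksLikeHtml(string value)
    {
        if (string.IsNullOrEmpty(value))
            return false;
        return TagPattern.IsMatch(value);
    }
}
```

In the field merging handler posted earlier in this thread, one option would be to call builder.InsertHtml only when LooksLikeHtml returns true for the field value, and otherwise leave e.Text untouched so that the normal merge runs, which, per the earlier posts, keeps the merge field's formatting.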
Best regards,