InsertDocument Changes Fonts of Resulting Document

Hi, Aspose Team,
I am inserting HTML snippets into an RTF template using the InsertDocument() code I found on this site. My doc could have:
“Some text in Arial 11 point… blah blah #address# More stuff in Arial 11”
The HTML is very simple:
123 main street
Then I replace #address# with the html.
After the replacement, the resulting RTF is all in Times Roman!
I will paste the code below.
Thanks in Advance!
Rich

public void InsertDocument(Node insertAfterNode, Document srcDoc, ParagraphAlignment alignment)
{
    // Make sure that the node is either a paragraph or table.
    if (insertAfterNode.NodeType != NodeType.Paragraph &&
        insertAfterNode.NodeType != NodeType.Table)
        throw new ArgumentException("The destination node should be either a paragraph or table.");
    // We will be inserting into the parent of the destination paragraph.
    CompositeNode dstStory = insertAfterNode.ParentNode;
    // This object will be translating styles and lists during the import.
    NodeImporter importer = new NodeImporter(srcDoc, insertAfterNode.Document,
        ImportFormatMode.UseDestinationStyles);
    // Loop through all sections in the source document.
    foreach (Section srcSection in srcDoc.Sections)
    {
        // Loop through all block level nodes (paragraphs and tables) in the body of the section.
        foreach (Node srcNode in srcSection.Body)
        {
            // Skip the node if it is the last empty paragraph in a section.
            if (srcNode.NodeType.Equals(NodeType.Paragraph))
            {
                Paragraph para = (Paragraph)srcNode;
                if (para.IsEndOfSection && !para.HasChildNodes)
                    continue;
            }
            // This creates a clone of the node, suitable for insertion into the destination document.
            Node newNode = importer.ImportNode(srcNode, true);
            if (newNode.GetType() == typeof(Paragraph))
            {
                ((Paragraph)newNode).ParagraphFormat.Alignment = alignment;
            }
            // Insert new node after the reference node.
            dstStory.InsertAfter(newNode, insertAfterNode);
            insertAfterNode = newNode;
        }
    }
}

Hi
Thanks for your request. Could you please attach your documents here for testing? I will investigate the issue and provide you with more information.
Best regards,

Sure - here are the docs.
Thanks for your help!
Rich
PS One more thing… when you replace the upper field (#specialnote#) with the HTML doc, then the image in the RTF disappears. I suspect the issues are related, but I thought I’d point it out. Thanks again!

Hi
Thank you for the additional information. I cannot reproduce the problem using the latest version of Aspose.Words (7.0.0). You can download this version from here:
https://releases.aspose.com/words/net
Best regards,

Hi, Andrey,
I grabbed the latest version. I can see that the fonts are behaving better, but the image in the doc still disappears. Do you see that also?
Also, my license is for an older version. I’m working with our procurement dept to get an update. How can I get a temp license in the meanwhile?
Thanks for all your help. I love the quick responses.
Rich

Hi
Thanks for your inquiry. I still cannot reproduce the problem with the picture on my side. The picture is present in my output document.
You can request a 30-day Temporary License here:
https://purchase.aspose.com/temporary-license
Best regards,

Hi, Andrey,
Please try with this new template. The issue is when the image is on the same line as the replaceable field.
I can repro this every time.
Thanks!
Rich

Hi Rich,
Thank you for the additional information, but I still cannot reproduce the problem on my side. I suppose you use a ReplaceEvaluator to replace the placeholder with HTML. Could you please provide the code of your ReplaceEvaluator? I will check the issue one more time and provide you with more information.
Best regards,

Hi,
Here is my theory… if the string in the HTML is long enough, it will overwrite the image in the RTF. Please try editing the HTML and making the string in the Body tag longer - say 100 char or more.
Thanks for your help!
Rich
Here is my handler:

private ReplaceAction InsertDocumentAtReplaceHandler(object sender, ReplaceEvaluatorArgs e)
{
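    string tempPath = @"c:\temp";
    // CreateDirectory does nothing if the directory already exists.
    Directory.CreateDirectory(tempPath);
    // Build the file names with Path.Combine so the directory separator is included.
    string guidrtffilename = Path.Combine(tempPath, Guid.NewGuid() + ".rtf");
    string guidhtmlfilename = Path.Combine(tempPath, Guid.NewGuid() + ".html");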
    string htmlagendatxt = "";
    try
    {
        Paragraph para = (Paragraph)e.MatchNode.ParentNode;
        if (e.Match.Value == "#specialnote#")
        {
            string meetingnote = string.IsNullOrEmpty(Meeting.MeetingNote) ? "" :
            Meeting.MeetingNote;
            htmlagendatxt = "";
            htmlagendatxt += meetingnote;
            htmlagendatxt += "";
        }
        else if (e.Match.Value == "#additionalnote#")
        {
            string reason = string.IsNullOrEmpty(Meeting.MeetingReason) ? "" :
            Meeting.MeetingReason;
            htmlagendatxt = "";
            htmlagendatxt += reason;
            htmlagendatxt += "";
        }
        else if (e.Match.Value == "#postinginfo#")
        {
            string postinginfo = string.IsNullOrEmpty(Meeting.MeetingPostingInfo) ? "" :
            Meeting.MeetingPostingInfo;
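            // Convert line breaks to HTML line breaks (the literal was lost in the post; "<br>" is assumed).
            string htmlpostinginfo = Regex.Replace(postinginfo, "\r\n", "<br>");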
            string paraTest = para.ToTxt();
            paraTest = Regex.Replace(paraTest, e.Match.Value, htmlpostinginfo);
            htmlagendatxt = "";
            htmlagendatxt += paraTest;
            htmlagendatxt += "";
        }
        else if (e.Match.Value == "#meetinglocation#")
        {
            string meetinglocation = string.IsNullOrEmpty(Meeting.MeetingLocation) ? "" :
            Meeting.MeetingLocation;
            string paraTest = para.ToTxt();
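            // Convert line breaks to HTML line breaks (the literal was lost in the post; "<br>" is assumed).
            string htmlmeetinglocation = Regex.Replace(meetinglocation, "\r\n", "<br>");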
            paraTest = Regex.Replace(paraTest, e.Match.Value, htmlmeetinglocation);
            htmlagendatxt = "";
            htmlagendatxt += paraTest;
            htmlagendatxt += "";
        }
        File.WriteAllText(guidhtmlfilename, htmlagendatxt);
        Document htmldoc = new Document(guidhtmlfilename);
        Document rtffromHtml = new Document();
        DocumentBuilder builder = new DocumentBuilder(rtffromHtml);
        builder.InsertHtml(htmlagendatxt);
        htmldoc.Save(guidrtffilename, SaveFormat.Rtf);
        // Insert a document after the paragraph, containing the match text.
        // Paragraph para = (Paragraph)e.MatchNode.ParentNode;
        // string txt = para.ToTxt();
        // string nother = e.Replacement;
        InsertDocument(para, htmldoc, para.ParagraphFormat.Alignment);
        RunCollection runs = para.Runs;
        // Remove the paragraph with the match text.
        para.Remove();
    }
    finally
    {
        if (File.Exists(guidrtffilename))
        {
            File.Delete(guidrtffilename);
        }
        if (File.Exists(guidhtmlfilename))
        {
            File.Delete(guidhtmlfilename);
        }
    }
    return ReplaceAction.Skip;
}

Hi
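Thank you for the additional information. The problem occurs because you remove the paragraph that contains the image. When the image is in the same paragraph as the placeholder, it is removed together with that paragraph.
Here is the relevant snippet of your code: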

InsertDocument(para, htmldoc, para.ParagraphFormat.Alignment);
RunCollection runs = para.Runs;
// Remove the paragraph with the match text.
para.Remove();

Also, I do not fully understand why you use the InsertDocument method here. I think you can just use the InsertHtml method. You can find an example here:
https://reference.aspose.com/words/net/aspose.words/range/replace/
Best regards,
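
For illustration, here is a minimal sketch of the InsertHtml approach suggested above. It assumes the older ReplaceEvaluator API used elsewhere in this thread (newer Aspose.Words releases use a different callback mechanism), and GetHtmlFor is a hypothetical helper that stands in for however the HTML snippet is obtained. Because the handler never removes the paragraph, an image sharing the paragraph with the placeholder should be left in place.

```csharp
using System.Text.RegularExpressions;
using Aspose.Words;

public class HtmlPlaceholderReplacer
{
    // Hypothetical helper (not part of Aspose.Words): returns the HTML
    // that should replace a placeholder such as "#address#".
    private string GetHtmlFor(string placeholder)
    {
        return "<p>123 main street</p>";
    }

    private ReplaceAction InsertHtmlAtReplaceHandler(object sender, ReplaceEvaluatorArgs e)
    {
        // Position a builder at the run that contains the match.
        DocumentBuilder builder = new DocumentBuilder((Document)e.MatchNode.Document);
        builder.MoveTo(e.MatchNode);

        // Insert the HTML at that position instead of importing a separate document.
        builder.InsertHtml(GetHtmlFor(e.Match.Value));

        // Replace only the placeholder text with an empty string;
        // the surrounding paragraph and any image in it are not touched.
        e.Replacement = "";
        return ReplaceAction.Replace;
    }

    public void ReplacePlaceholders(Document doc)
    {
        // The pattern is an assumption based on placeholders like #address# in this thread.
        doc.Range.Replace(new Regex("#[a-z]+#"),
            new ReplaceEvaluator(InsertHtmlAtReplaceHandler), false);
    }
}
```

This also removes the need for the temporary HTML and RTF files, since the HTML string is passed to the builder directly.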

Thanks! I will try that approach.

Hi,
While I was working through this, I found an issue with the version 7 library that does not exist in the older version (5.x) that we were using.
I create the HTML file (input_html.txt in this upload. Please rename it to have an HTML extension.)
I run it thru this code:

Document htmldoc = new Document(guidhtmlfilename);
Document rtffromHtml = new Document();
DocumentBuilder builder = new DocumentBuilder(rtffromHtml);
htmldoc.Save(guidrtffilename, SaveFormat.Rtf);

The saved file is the rtf file that I uploaded.
Please notice how the columns on the right are not right aligned in the resulting RTF file like they are in the HTML file.
Do you have any suggestions to fix this?
Thanks!
Rich

Hi
Thank you for reporting this problem to us. I managed to reproduce it on my side. You will be notified as soon as it is resolved.
It seems the problem occurs because there are merged cells in your HTML document. As a workaround, you can try using the following code:

Document doc = new Document(@"Test001\input_html.htm");
RemoveMergedCells(doc);
doc.Save(@"Test001\out.rtf");

================================================================

private static void RemoveMergedCells(Document doc)
{
    // Remove horizontally merged cells.
    // Get the collection of cells in the document.
    NodeCollection cells = doc.GetChildNodes(NodeType.Cell, true);
    ArrayList cellToRemove = new ArrayList();
    foreach (Cell cell in cells)
    {
        // Check whether the cell is merged with the previous one.
        if (cell.CellFormat.HorizontalMerge == CellMerge.Previous)
        {
            Cell prevCell = cell.PreviousSibling as Cell;
            if (prevCell != null)
            {
                prevCell.CellFormat.Width += cell.CellFormat.Width;
                cellToRemove.Add(cell);
            }
        }
    }
    foreach (Cell cell in cellToRemove)
        cell.Remove();
}

Hope this helps.
Best regards.

Hi,
Thanks. The type Cell is not resolving for me. What do I need to include (other than Aspose.Words)?
I already have
using Aspose.Words;
Thanks!
Rich

Hi
Thanks for your inquiry. You should add
using Aspose.Words.Tables;
Best regards,

Thanks! This looks like it’s working.
Rich

Hi!
I’m sorry to resurrect an older thread, but I found a similar issue with the new DLL. This might be related to the RemoveMergedCells() method you kindly provided.
Using the code I provided earlier in this thread, when I send the attached HTML file (called agendaCorrectFormat.htm.txt), the output RTF file shows the table with columns that do not look correct.
Can you suggest a fix or workaround?
Thanks in advance!
Rich

Hi Rich,
Thanks for your inquiry. I see the problem in your document. However, unfortunately, I cannot suggest any workaround to fix the issue. The problem is the same – RTF does not handle merged cells as expected.
Best regards.

Hi,
Thanks for your reply. This might not seem like a big issue, but to my customer it’s huge. Is there any chance for a hot-fix? We might have to delay our launch because of this issue.
Thanks!
Rich

Hi Rich,
Thanks for your request. I will notify you as soon as the original issue is resolved. As a workaround, you can try using the DOC format instead of RTF. When I save the original HTML to DOC, the problem does not occur.
Best regards.
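
A minimal sketch of that workaround, assuming the same input path as the earlier example in this thread; the class name and output file name are made up for illustration.

```csharp
using Aspose.Words;

public static class HtmlToDocExample
{
    public static void Convert()
    {
        // Load the HTML file (path taken from the earlier workaround code).
        Document doc = new Document(@"Test001\input_html.htm");

        // Save as DOC instead of RTF; the output file name is an assumption.
        doc.Save(@"Test001\out.doc", SaveFormat.Doc);
    }
}
```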