Issue with Bulleted list on Word to PDF generation

ponnacp · October 5, 2012, 4:38pm

Hi -
We are using Aspose.Words 11.0 and Aspose.Pdf.Kit 6.0 in our application. Tested with Aspose.Words 11.8 too.
We have two applications which use Aspose to generate PDF documents for customers. Both of these applications use an editor we developed in house that is just a wrapper around the Microsoft rich text box control that is used in WordPad. Our application takes an RTF string created by this editor from our database at document generation time and passes it to Aspose to render as PDF.

The problem we have discovered is that Aspose does not render bulleted text correctly when that text was created in either our editor or WordPad. The RTF generated by both our editor and WordPad appears to be valid per the RTF specification, but Aspose does not seem to recognize it (both Word and WordPad do). Aspose does render the bullets when the text was bulleted in Word, but the indentation is still off, so the text is not formatted correctly in that case either. It appears that WordPad and our custom tool emit the backward-compatible RTF outlined in the spec, while Word is doing its own thing. A sample of the RTF described can be found in Test.rtf.

A zip file is attached which includes the RTF sample file (Test.rtf) and various screenshots, which are self-explanatory.
We would really like to be able to generate PDF that matches the appearance of the text in the applications. Any assistance you can provide would be greatly appreciated.

Thanks!

tahir.manzoor · October 9, 2012, 1:49am

Hi Ponnamma,

Please accept my apologies for the late response.

Thanks for your inquiry. I have tested the scenario and have not found the indentation issue while using the latest version of Aspose.Words for .NET. Please use the latest version of Aspose.Words for .NET. I have attached the output PDF file to this post.

Please let us know if you have any more queries.

ponnacp · October 10, 2012, 11:15am

Hi:
Thanks for your response. Yes, it works perfectly when you just load the RTF file and save it as a PDF. But it doesn't work in our scenario, where we're using the NodeImporter to preserve the formatting. Sorry for not being clear. Here are the details:
We have a Word (docx) template with MergeFields. Some of them are replaced by rich text strings which are retrieved from the database. We use the NodeImporter to preserve/translate the formatting of the rich text, including the bulleted list.
I have attached our code file, the template with a MergeField, and the output file, which was created using the latest Aspose.Words.dll version (11.8). You'll have to change the file path in the code.
Your response will be much appreciated. Thanks!

tahir.manzoor · October 11, 2012, 3:41am

Hi Ponnamma,

Thanks for sharing more information. I have modified your code; please use the modified code below for your requirement. I have attached the output PDF file to this post. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.

Document doc = new Document(MyDir + "MergeField_Richtxt.docx");
doc.JoinRunsWithSameFormatting();
DocumentBuilder builder = new DocumentBuilder(doc);
string replacer = "CUSTOM_CONTINUED_OCCUPANY_REQ";
Node[] nodes = GetMergeValue(doc);
builder.MoveToMergeField(replacer);
// foreach (Node rValue in nodes)
// {
//     if (rValue is Paragraph)
//     {
//         Paragraph p = rValue as Paragraph;
//         // Insert a ParagraphBreak only if the node IS NOT the first node. This will avoid the empty space before the text.
//         if (rValue != nodes[0])
//         {
//             // Fix for paragraph 6 spacing issues.
//             // builder.InsertBreak(BreakType.ParagraphBreak);
//             builder.InsertBreak(BreakType.LineBreak);
//         }
//         Node[] runs = p.GetChildNodes(NodeType.Run, true, false).ToArray();
//         foreach (Run r in runs)
//         {
//             builder.InsertNode(r);
//         }
//     }
//     else if (rValue is Run)
//     {
//         builder.InsertNode(rValue);
//     }
// }
Document dstDoc = GenerateDocument(doc, nodes);
dstDoc.Save(MyDir + "AsposeOut.pdf", Aspose.Words.SaveFormat.Pdf);

public static Document GenerateDocument(Document srcDoc, Node[] nodes)
{
    // Create a blank document.
    Document dstDoc = new Document();
    // Remove the first paragraph from the empty document.
    dstDoc.FirstSection.Body.RemoveAllChildren();
    // Import each node from the list into the new document. Keep the original formatting of the node.
    NodeImporter importer = new NodeImporter(srcDoc, dstDoc, ImportFormatMode.KeepSourceFormatting);
    foreach(Node node in nodes)
    {
        Node importNode = importer.ImportNode(node, true);
        dstDoc.FirstSection.Body.AppendChild(importNode);
    }
    // Return the generated document.
    return dstDoc;
}
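
The GetMergeValue helper called above comes from the attached code file and is not reproduced in this thread. For readers without the attachment, here is a minimal sketch of one way such a helper could turn an RTF string into block-level nodes. The names RtfNodeLoader and LoadRtfNodes are hypothetical, and the sketch assumes the RTF markup is 7-bit ASCII, as RTF normally is; the actual helper may work differently.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;

// Hypothetical helper, not taken from the attached code file.
public static class RtfNodeLoader
{
    // Loads an RTF string into a temporary document and returns
    // the direct children (paragraphs and tables) of its first body.
    public static Node[] LoadRtfNodes(string rtf)
    {
        // Assumption: the RTF text is plain ASCII, so UTF-8 encoding leaves the bytes unchanged.
        byte[] bytes = Encoding.UTF8.GetBytes(rtf);
        using (MemoryStream stream = new MemoryStream(bytes))
        {
            // Aspose.Words detects the RTF format from the stream contents.
            Document rtfDoc = new Document(stream);
            return rtfDoc.FirstSection.Body.ChildNodes.ToArray();
        }
    }
}
```

The returned nodes still belong to the temporary document, so they have to be imported (for example with NodeImporter) before they can be added to another document.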

ponnacp · October 11, 2012, 9:35am

Hi -
Thanks again for your quick response. I guess the template (MergeField_Richtxt.docx) I sent you was not a good reproduction of our actual one - very sorry! Our actual template has a lot of static text and dynamic text (which uses MergeFields). Our application handles many such templates, and the MergeFields can be rich text or plain text, each handled accordingly. (The code I sent you is more or less the same as the code we use in our application.)
So, your solution of creating a new document and using the NodeImporter doesn't solve the problem in our situation.
I have attached another template (Template_MergeField_Richtxt.docx) which is close to what our actual ones look like.
Thanks!!

tahir.manzoor · October 12, 2012, 7:30am

Hi Ponnamma,

Thanks for sharing the details. Please use the following code snippet for your requirements. I also suggest reading the documentation at the following link for reference. I have attached the output PDF file to this post. Let me know if you have any more queries.

https://docs.aspose.com/words/java/insert-and-append-documents/

Document doc = new Document(MyDir + "Template_MergeField_Richtxt.docx");

doc.JoinRunsWithSameFormatting();
DocumentBuilder builder = new DocumentBuilder(doc);
string replacer = "CUSTOM_CONTINUED_OCCUPANY_REQ";
Node[] nodes = GetMergeValue(doc);
builder.MoveToMergeField(replacer);
Document dstDoc = GenerateDocument(doc, nodes);
// Pass the paragraph node after which the RTF content should be inserted.
InsertDocument(builder.CurrentParagraph, dstDoc);
doc.Save(MyDir + "AsposeOut.pdf", Aspose.Words.SaveFormat.Pdf);

public void InsertDocument(Node insertAfterNode, Document srcDoc)
{
    // Make sure that the node is either a paragraph or table.
    if ((!insertAfterNode.NodeType.Equals(NodeType.Paragraph)) &&
        (!insertAfterNode.NodeType.Equals(NodeType.Table)))
        throw new ArgumentException("The destination node should be either a paragraph or table.");
    // We will be inserting into the parent of the destination paragraph.
    CompositeNode dstStory = insertAfterNode.ParentNode;
    // This object will be translating styles and lists during the import.
    NodeImporter importer = new NodeImporter(srcDoc, insertAfterNode.Document, ImportFormatMode.KeepSourceFormatting);
    // Loop through all sections in the source document.
    foreach(Section srcSection in srcDoc.Sections)
    {
        // Loop through all block level nodes (paragraphs and tables) in the body of the section.
        foreach(Node srcNode in srcSection.Body)
        {
            // Let's skip the node if it is a last empty paragraph in a section.
            if (srcNode.NodeType.Equals(NodeType.Paragraph))
            {
                Paragraph para = (Paragraph) srcNode;
                if (para.IsEndOfSection && !para.HasChildNodes)
                    continue;
            }
            // This creates a clone of the node, suitable for insertion into the destination document.
            Node newNode = importer.ImportNode(srcNode, true);
            // Insert new node after the reference node.
            dstStory.InsertAfter(newNode, insertAfterNode);
            insertAfterNode = newNode;
        }
    }
}