How to Merge Multiple RTF Chapter (content) blocks into a PDF and build a TOC

Hi Nitin,

Thanks for your queries. You are using an old version of Aspose.Words. Please note that every new release of Aspose.Words comes with some new features, enhancements in the existing features, and bug fixes.

Please use the latest version of Aspose.Words for .NET.

The code shared in my previous post was only a sample; please adapt it to your Object Model. As per your queries, you want to do the following:

  • Insert a Table of Contents into the document
  • Set the styles of the Table of Contents
  • Insert an image into the document and set its size
  • Insert RTF contents into the document

The following code snippet shows the idea; please modify it as per your Object Model.

Sample Code:

Document FinalDoc = new Document();
DocumentBuilder builder = new DocumentBuilder(FinalDoc);

builder.MoveTo(FinalDoc.FirstSection.Body.FirstParagraph);

// Object 1: File path to an image (.jpg) file - This must
// appear on the FIRST page and resize and fit the page.
// forced page break here
builder.MoveToParagraph(0, 0);

// Read image from file to Image object
Image img = Image.FromFile("d:\\Chrysanthemum.jpg");
double targetHeight;
double targetWidth;
CalculateImageSize(builder, img, out targetHeight, out targetWidth);

// Add the image to the document and set its position and size
Shape shp = builder.InsertImage(img, targetWidth, targetHeight);
shp.WrapType = WrapType.Inline;

// Object 2: RTF Content (Title and some Copyright information)
Document docChapter1Title = RtfStringToDocument("chapter 1 title RTF String");
Paragraph paragraph = builder.InsertParagraph();
InsertDocument(paragraph, docChapter1Title);
builder.InsertBreak(BreakType.PageBreak);

// Table of Contents (chapter titles such as Object 3 appear here when they use Heading styles)
builder.ParagraphFormat.StyleIdentifier = StyleIdentifier.Heading1;
builder.Font.Color = Color.Blue;
builder.Writeln("Table of Contents ");
builder.InsertTableOfContents("\\o \"1-3\" \\h \\z \\u");
builder.Writeln("");

// Your Code...

// Object 4: RTF Content - This is the RTF content (contains text and images) that belongs to the above chapter1
Document Object4Doc = RtfStringToDocument("Object 4: RTF Content - This is the RTF content (contains text and images) that belongs to the above chapter1");
paragraph = builder.InsertParagraph();
InsertDocument(paragraph, Object4Doc);

// Your Code...

// Call the UpdateFields method before saving the document
FinalDoc.UpdateFields();
FinalDoc.Save(MyDir + "AsposeOut.doc", SaveFormat.Doc);

/// <summary>
/// Calculates the size of the image so that it fits on the page, inside the margins.
/// An image that already fits keeps its original size; a larger image is scaled
/// down proportionally.
/// </summary>
/// <param name="builder">DocumentBuilder is used to determine Height and Width of current Page</param>
/// <param name="img">Original image</param>
/// <param name="targetHeight">Height of the image</param>
/// <param name="targetWidth">Width of the image</param>
private void CalculateImageSize(DocumentBuilder builder, Image img, out double targetHeight, out double targetWidth)
{
    // Calculate width and height of the page
    PageSetup ps = builder.CurrentSection.PageSetup;
    targetHeight = ps.PageHeight - ps.TopMargin - ps.BottomMargin;
    targetWidth = ps.PageWidth - ps.LeftMargin - ps.RightMargin;

    // Get size of an image
    double imgHeight = ConvertUtil.PixelToPoint(img.Height);
    double imgWidth = ConvertUtil.PixelToPoint(img.Width);

    if (imgHeight < targetHeight && imgWidth < targetWidth)
    {
        targetHeight = imgHeight;
        targetWidth = imgWidth;
    }
    else
    {
        // Calculate size of an image in the document
        double ratioWidth = imgWidth / targetWidth;
        double ratioHeight = imgHeight / targetHeight;
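        if (ratioWidth > ratioHeight)
        {
            // Width is the limiting dimension: keep targetWidth, reduce the height.
            targetHeight = (targetHeight * (ratioHeight / ratioWidth));
        }
        else
        {
            // Height is the limiting dimension: keep targetHeight, reduce the width.
            targetWidth = (targetWidth * (ratioWidth / ratioHeight));
        }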
    }
}
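
For reference, the same scaling rule can be checked without Aspose.Words. The following is only a minimal sketch in plain C#: ImageFit and FitToBox are hypothetical names that are not part of Aspose.Words or of the code above, and all sizes are assumed to be in points already. It uses a single scale factor, which should give the same result as the two ratio branches above.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Aspose.Words.
public static class ImageFit
{
    // Returns the largest size that fits inside the box while keeping the
    // image's aspect ratio. Images that already fit are not enlarged.
    public static (double Width, double Height) FitToBox(
        double imgWidth, double imgHeight, double boxWidth, double boxHeight)
    {
        if (imgWidth <= 0 || imgHeight <= 0 || boxWidth <= 0 || boxHeight <= 0)
        {
            throw new ArgumentException("All dimensions must be positive.");
        }

        // Scale factor allowed by each axis; the smaller one is the binding constraint.
        double scale = Math.Min(boxWidth / imgWidth, boxHeight / imgHeight);

        // A factor of 1 or more means the image already fits, so keep its size.
        if (scale >= 1.0)
        {
            return (imgWidth, imgHeight);
        }

        return (imgWidth * scale, imgHeight * scale);
    }
}
```

For example, ImageFit.FitToBox(768, 576, 468, 648) returns 468 x 351: the width is the limiting dimension, so the height is reduced by the same factor.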

Please read the following forum link for your kind reference:
https://forum.aspose.com/t/52359

Hi Tahir,
The code is still not producing the correct .doc file. Please see the attached doc files.
lion-Generated-By-Aspose.doc - created by Aspose.Words
Lion-Good-Final.doc - manually created by me. This is what I want the final .doc to look like.

Here's the C# code method I am using:

Document FinalDoc = new Document();
DocumentBuilder builder = new DocumentBuilder(FinalDoc);

builder.MoveTo(FinalDoc.FirstSection.Body.FirstParagraph);

builder.MoveToParagraph(0, 0);

// =============================================
// COVER PAGE
// =============================================
// Read image from file to Image object
Image img = Image.FromFile(@_EBookMaker.CoverPage.CoverPageImageFileName);

double targetHeight;
double targetWidth;
AsposeHelperManager.CalculateImageSize(builder, img, out targetHeight, out targetWidth);

// Add the image to the document and set its position and size
Shape shp = builder.InsertImage(img, targetWidth, targetHeight);
shp.WrapType = WrapType.Inline;
builder.InsertBreak(BreakType.PageBreak);


// =============================================
// TITLE PAGE AND COPYRIGHT INFO
// =============================================
//Object 2: RTF Content (Title and some Copyright information)
Document docTitlePage = AsposeHelperManager.RtfStringToDocument(@_EBookMaker.TitlePage.RtfContent);
Paragraph paragraph = builder.InsertParagraph();
AsposeHelperManager.InsertDocument(paragraph, docTitlePage);
builder.InsertBreak(BreakType.PageBreak);


// =============================================
// TOC
// =============================================
builder.ParagraphFormat.StyleIdentifier = StyleIdentifier.Heading1;
builder.Font.Color = Color.Blue;
builder.Writeln("Table of Contents ");
builder.InsertTableOfContents("\\o \"1-3\" \\h \\z \\u");
builder.InsertBreak(BreakType.PageBreak);


// =============================================
// NOW PROCESS ALL CHAPTERS …
// =============================================
foreach (EBookSectionChapter chapter in _EBookMaker.ChapterList)
{
    // Chapter Title
    Document docChapterTitle = AsposeHelperManager.RtfStringToDocument(chapter.TitleAsRtf);
    builder.ParagraphFormat.StyleIdentifier = StyleIdentifier.Heading1;
    builder.Font.Color = Color.Black;
    Paragraph ChapterTitleParagraph = builder.InsertParagraph();
    AsposeHelperManager.InsertDocument(ChapterTitleParagraph, docChapterTitle);

    // Chapter Content
    Document docChapterContent = AsposeHelperManager.RtfStringToDocument(chapter.RtfContent);
    builder.ParagraphFormat.StyleIdentifier = StyleIdentifier.Normal;
    builder.Font.Color = Color.Black;
    Paragraph ChapterContentParagraph = builder.InsertParagraph();
    AsposeHelperManager.InsertDocument(ChapterContentParagraph, docChapterContent);

    builder.Writeln("");

    // =============================================
    // PROCESS ALL SECTIONS …
    // =============================================
    foreach (EBookSectionChapterSection section in chapter.EBookSectionChapterSectionList)
    {
        // NOTE: 'section' is not used inside this loop; the chapter's title and
        // content are read again for every section.

        // Section Title
        Document docSectionTitle = AsposeHelperManager.RtfStringToDocument(chapter.TitleAsRtf);
        builder.ParagraphFormat.StyleIdentifier = StyleIdentifier.Heading2;
        builder.Font.Color = Color.Black;
        Paragraph SectionTitleParagraph = builder.InsertParagraph();
        AsposeHelperManager.InsertDocument(SectionTitleParagraph, docSectionTitle);

        // Section Content
        Document docSectionContent = AsposeHelperManager.RtfStringToDocument(chapter.RtfContent);
        builder.ParagraphFormat.StyleIdentifier = StyleIdentifier.Normal;
        builder.Font.Color = Color.Black;
        Paragraph SectionContentParagraph = builder.InsertParagraph();
        AsposeHelperManager.InsertDocument(SectionContentParagraph, docSectionContent);

        builder.Writeln("");
    }
}

// =============================================
// REFRESH THE TOC …
// =============================================
FinalDoc.UpdateFields();

FinalDoc.Save(OutputFilePath, SaveFormat.Doc);

Hi Nitin,

Thanks for sharing the document. I have modified the code according to the shared document. Please find the output document in the attachment. You can use the same code in your object model. Hope this helps you. Please let us know if you have any more queries.

Document FinalDoc = new Document();
DocumentBuilder builder = new DocumentBuilder(FinalDoc);

Style Heading1 = FinalDoc.Styles[StyleIdentifier.Heading1];
Style Heading2 = FinalDoc.Styles[StyleIdentifier.Heading2];

Heading1.Font.Name = "Cambria";
Heading1.Font.Size = 14;
Heading1.Font.Bold = true;
Heading1.Font.Color = Color.DarkBlue;

Heading2.Font.Name = "Cambria";
Heading2.Font.Size = 13;
Heading2.Font.Bold = true;
Heading2.Font.Color = Color.DarkBlue;

builder.MoveTo(FinalDoc.FirstSection.Body.FirstParagraph);
builder.MoveToParagraph(0, 0);
builder.ParagraphFormat.Alignment = ParagraphAlignment.Center;

// =============================
//  COVER PAGE
// =============================
// Read image from file to Image object
Image img = Image.FromFile(MyDir + "Lion.png");

// Add the image to the document and set its position and size
Shape shp = builder.InsertImage(img);
shp.WrapType = WrapType.Inline;

// builder.InsertBreak(BreakType.PageBreak);

Paragraph para = builder.InsertParagraph();
para.AppendChild(new Run(FinalDoc, ""));

// =============================
//  TITLE PAGE AND COPYRIGHT INFO
// =============================
// Object 2: RTF Content  (Title and some Copyright information)
// Document docTitlePage = AsposeHelperManager.RtfStringToDocument(@_EBookMaker.TitlePage.RtfContent);//"chapter 1 title RTF STring");

String str = @"This is the Title and Copyright Page.
This should be right after the cover image page

Copyright © 2012 by Joe Blog. All rights reserved worldwide. No part of
this publication may be replicated, redistributed, or given away in any
form without the prior written consent of the author/publisher or the terms
relayed to you herein.

Joe Blog, Book Wizard Group,
1023 King Street, Toronto, M6W 2K5, Canada
www.wizardGroup.com";

Document docTitlePage = RtfStringToDocument(str);

foreach (Paragraph para2 in docTitlePage.GetChildNodes(NodeType.Paragraph, true))
{
    para2.ParagraphFormat.Alignment = ParagraphAlignment.Center;
}

InsertDocument(para, docTitlePage);
builder.MoveTo(para);
builder.InsertBreak(BreakType.PageBreak);

// =============================
//  TOC
// =============================
builder.MoveToDocumentEnd();
builder.InsertBreak(BreakType.PageBreak);
builder.ParagraphFormat.ClearFormatting();
builder.Font.Color = Color.DarkBlue;
builder.Font.Name = "Cambria";
builder.Font.Size = 14;
builder.Writeln("Contents ");

builder.Font.Color = Color.Black;
builder.Font.Name = "Calibri";
builder.Font.Size = 11;

builder.InsertTableOfContents("\\o \"1-3\" \\h \\z \\u");
builder.InsertBreak(BreakType.PageBreak);

builder.Font.ClearFormatting();
builder.ParagraphFormat.StyleIdentifier = StyleIdentifier.Heading1;
builder.Writeln("Chapter 1");

shp = builder.InsertImage(img);
shp.WrapType = WrapType.Inline;
builder.Writeln("");

str = @"This is Chapter 1 content...
Africa's lions of the Serengeti Park. This is an amazing peek into the lives of a tribe of lions that live and hunt on the vast plains of the Serengeti National Park.";
builder.ParagraphFormat.ClearFormatting();
builder.Font.Name = "Arial";
builder.Font.Size = 12;
builder.Font.Color = Color.Black;
builder.Font.Bold = false;
builder.ParagraphFormat.Alignment = ParagraphAlignment.Left;
builder.Writeln(str);

builder.Font.ClearFormatting();
builder.ParagraphFormat.StyleIdentifier = StyleIdentifier.Heading2;
builder.Writeln("Introduction");
builder.Writeln("");

str = @"This is Section 1 of Chapter 1 content...
Lions are the only cats that live in groups, which are called prides. Prides are family units that may include up to three males, a dozen or so females, and their young. All of a pride's lionesses are related, and female cubs typically stay with the group as they age. Young males eventually leave and establish their own prides by taking over a group headed by another male.";
builder.ParagraphFormat.ClearFormatting();
builder.Font.Name = "Arial";
builder.Font.Size = 12;
builder.Font.Color = Color.Black;
builder.Font.Bold = false;
builder.ParagraphFormat.Alignment = ParagraphAlignment.Left;
builder.Writeln(str);

// Your code..................
// Your code..................
// Your code..................

FinalDoc.UpdateFields();

FinalDoc.Save(MyDir + "AsposeOut.doc", SaveFormat.Doc);

Hi Tahir,
Thanks for the code. Just one question though.
I do not know what is in the chapter content. Some may have many images and text, and some may only have text. All the content, however, is held in RTF format.
So how would the code work then?


Hi Nitin,

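Thanks for your query. You can use the following code snippet to insert RTF contents into your document by using the RtfStringToDocument and InsertDocument methods. The whole RTF string is loaded as a document, so it does not matter whether a chapter contains only text or both text and images.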

Document rtfDoc = RtfStringToDocument("RTF contents .....");

// You can also load an RTF document by using the Document class
// Document rtfDoc = new Document("in.rtf");

Please read following documentation links for your kind reference.

Hi Tahir,

Thanks for the code. I kind of got it to work, but not 100%. I have 3 more requirements.

1. My Chapter titles are held in my object as RTF - this is because the title can have font, style, justification on it. So how do I take that Title now (in RTF format) and convert that to a style of Heading1?

2. I have special tags inside my content like this: []

How do I convert this into a bookmark?

3. I also have a special tag which is a link to a bookmark

like this: [>jump to b1~b1<]

The format is: [>Display Text~Bookmark<]

How do I convert this to a link to a bookmark called "b1",

so the link in the final word doc would look like this: jump to b1

which links to the b1 bookmark

Hi Nitin,


Please accept my apologies for late response. I am working over your query and will get back to you shortly.

Hi Nitin,

Thanks for your queries.

nitin.mistry@bell.ca:

  1. My Chapter titles are held in my object as RTF - this is because the title can have font, style, justification on it. So how do I take that Title now (in RTF format) and convert that to a style of Heading1?

Please use the ParagraphFormat.Style property to set the style as Heading1.

Document rtfDoc = RtfStringToDocument("RTF Title contents .....");
Aspose.Words.Style h1style = rtfDoc.Styles[StyleIdentifier.Heading1];

// Loop through every paragraph node.
foreach (Paragraph para in rtfDoc.GetChildNodes(NodeType.Paragraph, true))
{
    para.ParagraphFormat.Style = h1style;
}

nitin.mistry@bell.ca:
2. I have special tags inside my content like this: <b1>

Please read the following documentation links for your reference. You can use IReplacingCallback Interface, find such special tags and insert a bookmark at that position.

nitin.mistry@bell.ca:
3. I also have a special tag which is a link to a bookmark

like this: [>jump to b1~b1<]

The format is: [>Display Text~Bookmark<]

Please check my reply at the following forum link.

Hi Tahir,

I saw the code at that link, but that code is for external links.
These are internal bookmarks and links to bookmark.

So this is what I want to do.
I have a Word document with my special custom tags, as follows:

[] - this represents a bookmark ‘b1’

[>jump~b1<] - this represents a link to the b1 bookmark (where ‘jump’ is the link’s display text).

How do I convert these tags into proper bookmarks and links to bookmarks?
I have many in a typical Word document.

Thanks

Hi Nitin,

Please accept my apologies for late response.

Thanks for your inquiry. The DocumentBuilder.InsertHyperlink method inserts a hyperlink into the document. Please pass “true” as the third parameter, as shown in the following code snippet.

Third parameter: isBookmark. True if the previous parameter is the name of a bookmark inside the document; false if the previous parameter is a URL.
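
Before calling InsertHyperlink, each [>Display Text~Bookmark<] tag has to be split into its display text and bookmark name. The following is only a minimal sketch of that split in plain C#, with no Aspose.Words calls: LinkTag and LinkTagParser are hypothetical names, and it assumes the whole tag is available as one string, which is not guaranteed inside a document where a tag can be spread over several Run nodes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical types for illustration only; not part of Aspose.Words.
public sealed class LinkTag
{
    public string DisplayText { get; }
    public string BookmarkName { get; }
    public int Index { get; } // position of the tag in the source text

    public LinkTag(string displayText, string bookmarkName, int index)
    {
        DisplayText = displayText;
        BookmarkName = bookmarkName;
        Index = index;
    }
}

public static class LinkTagParser
{
    // Matches "[>Display Text~Bookmark<]". The display text may not contain
    // '~' or brackets; the bookmark name may not contain '<' or brackets.
    private static readonly Regex TagPattern = new Regex(
        @"\[>(?<text>[^~\[\]]*)~(?<bookmark>[^<\[\]]*)<\]");

    // Returns every link tag found in the input, in order of appearance.
    public static List<LinkTag> Parse(string input)
    {
        if (input == null)
        {
            throw new ArgumentNullException(nameof(input));
        }

        var tags = new List<LinkTag>();
        foreach (Match match in TagPattern.Matches(input))
        {
            tags.Add(new LinkTag(
                match.Groups["text"].Value,
                match.Groups["bookmark"].Value,
                match.Index));
        }
        return tags;
    }
}
```

For the text "See [>jump to b1~b1<]", Parse returns one tag with DisplayText "jump to b1" and BookmarkName "b1"; these two values correspond to the first two arguments of InsertHyperlink, with true as the third.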

Please use the following code snippet for your reference. I have attached the input and output documents with this post. Please let us know if you have any more queries.

// Requires: using System.Collections; using System.Drawing;
// using System.Text.RegularExpressions; using Aspose.Words;
Document doc = new Document(MyDir + "in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);

// Find the start of every "[>Display Text~Bookmark<]" tag.
Regex regex = new Regex("\\[>", RegexOptions.IgnoreCase);
FindLinks obj = new FindLinks();
doc.Range.Replace(regex, obj, true);

String link = "";
Node currentNode = null;
ArrayList removenodes = new ArrayList();

foreach (Run run in obj.nodes)
{
    // Collect the nodes that make up the tag, up to the closing "<]".
    currentNode = run;
    link += currentNode.Range.Text;
    removenodes.Add(currentNode);

    while (!currentNode.Range.Text.Contains("<]"))
    {
        currentNode = currentNode.NextPreOrder(doc);
        removenodes.Add(currentNode);
        link += currentNode.Range.Text;
    }

    // LinkNode[0] holds "[>Display Text" and LinkNode[1] holds "Bookmark<]".
    String[] LinkNode = link.Split(new Char[] { '~' });
    link = "";

    builder.MoveTo(run);

    // Specify font formatting for the hyperlink.
    builder.Font.Color = Color.Blue;
    builder.Font.Underline = Underline.Single;
    builder.InsertHyperlink(LinkNode[0].Replace("[>", ""), LinkNode[1].Replace("<]", ""), true);

    // Revert to default formatting.
    builder.Font.ClearFormatting();
}

// Remove the nodes that held the original tag text.
foreach (Node node in removenodes)
{
    node.Remove();
}

doc.Save(MyDir + "AsposeOut.docx");

/// <summary>
/// This is called during a replace operation each time a match is found.
/// It records the matched node and leaves the document text unchanged.
/// </summary>
public class FindLinks : IReplacingCallback
{
    // Stores the matched nodes.
    public ArrayList nodes = new ArrayList();

    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;
        nodes.Add(currentNode);
        return ReplaceAction.Skip;
    }
}