How to retrieve the body section from the RTF string?

Hello,

How can I retrieve the body section from the rtfString?

@houmaid You can use the following simple code to load an RTF string into an Aspose.Words.Document:

private static Document LoadRtfString(string rtf)
{
    using (MemoryStream rtfStream = new MemoryStream(Encoding.UTF8.GetBytes(rtf)))
    {
        return new Document(rtfStream);
    }
}

Once it is loaded, you can work with it as with a regular document. If you need more assistance, please describe your requirements in more detail.
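To make the "work with it as with a regular document" step concrete, here is a minimal sketch of reading the body of a loaded RTF string. The class and method names (RtfBodyReader, GetBodyText) are hypothetical, and the sketch assumes the document has a single section; a multi-section document has one Body per section, so you would loop over doc.Sections instead.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;

public static class RtfBodyReader
{
    // Hypothetical helper: loads an RTF string and returns the plain text
    // of the first section's body (headers and footers are not part of Body).
    public static string GetBodyText(string rtf)
    {
        using (MemoryStream rtfStream = new MemoryStream(Encoding.UTF8.GetBytes(rtf)))
        {
            Document doc = new Document(rtfStream);

            // Assumption: only the first section is of interest.
            Body body = doc.FirstSection.Body;

            // Convert the body node and its children to plain text.
            return body.ToString(SaveFormat.Text);
        }
    }
}
```

The Body node can also be traversed or modified like any other node, for example to enumerate its paragraphs, rather than only converted to text.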

Hello Alexey

I have used the following code to convert an HTML image to RTF:

private static string GetRtfString(string html)
{
    string rtfString = "";
    // Create a document and insert HTML into it.
    DocumentBuilder builder = new DocumentBuilder();
    builder.InsertHtml(html);
    // Save document to stream as RTF and get RTF string.
    using (MemoryStream rtfStream = new MemoryStream())
    {
        builder.Document.Save(rtfStream, SaveFormat.Rtf);
        byte[] rtfBytes = rtfStream.ToArray();
        rtfString = Encoding.UTF8.GetString(rtfBytes);
    }
    return rtfString;
}

This rtfString is then merged into another RTF document.
However, the returned rtfString uses different fonts than my main RTF document.
I have already tried to replace the font, but it does not work.

Is it possible to retrieve just the image from rtfString?

@houmaid Why don't you insert the HTML directly into your main document? Could you please describe your requirements in more detail and provide your input documents, the current output, and the expected output? We will check and provide more information or a solution.
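As a rough illustration of inserting the HTML directly into the main document, the sketch below loads the main RTF, moves to its end, and inserts the HTML there. The names HtmlMerger and AppendHtmlToRtf are hypothetical, and inserting at the end of the document is an assumption. It uses the InsertHtml overload with useBuilderFormatting set to true, which takes the builder's current formatting as the base for the imported text; this may help with the font mismatch, but styles set explicitly in the HTML would still apply.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;

public static class HtmlMerger
{
    // Hypothetical helper: appends HTML to an existing RTF document
    // and returns the combined document as an RTF string.
    public static string AppendHtmlToRtf(string mainRtf, string html)
    {
        Document mainDoc;
        using (MemoryStream input = new MemoryStream(Encoding.UTF8.GetBytes(mainRtf)))
        {
            mainDoc = new Document(input);
        }

        DocumentBuilder builder = new DocumentBuilder(mainDoc);

        // Assumption: the HTML belongs at the end of the main document.
        builder.MoveToDocumentEnd();

        // true = use the builder's formatting as the base formatting
        // for the imported HTML content.
        builder.InsertHtml(html, true);

        using (MemoryStream output = new MemoryStream())
        {
            mainDoc.Save(output, SaveFormat.Rtf);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}
```

This avoids the intermediate RTF string entirely, so there is no second document whose fonts need to be reconciled afterwards.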

I have tried the way you described.
I inserted the HTML into a DOCX now. So how do I retrieve the text inside the w:body?

@houmaid If you need to extract only the text content, you can either simply save the document to TXT format or use the Node.ToString method. For example, see the following code:

// If required, you can specify additional TXT save options.
TxtSaveOptions opt = new TxtSaveOptions();
// Ignore headers and footers.
opt.ExportHeadersFootersMode = TxtExportHeadersFootersMode.None;

// Extracts text content using ToString method.
string textContent = doc.ToString(opt);

Thanks.
The HTML is added to the DOCX; however, the HTML contains a base64 image and the DOCX gives an error.
How can I add an HTML base64 image to the DOCX?

@houmaid Aspose.Words supports images in base64 format in HTML. You can simply load such HTML into the Document object:

string htmlWithBase64 = "<html>"
    + "<body style=\"font-family:'Times New Roman'; font-size:12pt\">"
        + "<div>"
            + "<p style=\"margin-top:0pt; margin-bottom:0pt\"><span>Hello, World!!!</span></p>"
            + "<p style=\"margin-top:0pt; margin-bottom:0pt\"><img src=\"\" /></p>"
        + "</div>"
    + "</body>"
    + "</html>";

// Load html string into the Document.
using (MemoryStream htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(htmlWithBase64)))
{
    Document doc = new Document(htmlStream);
    doc.Save(@"C:\Temp\out.docx");
}

Or insert the HTML string using DocumentBuilder:

string htmlWithBase64 = "<html>"
    + "<body style=\"font-family:'Times New Roman'; font-size:12pt\">"
        + "<div>"
            + "<p style=\"margin-top:0pt; margin-bottom:0pt\"><span>Hello, World!!!</span></p>"
            + "<p style=\"margin-top:0pt; margin-bottom:0pt\"><img src=\"\" /></p>"
        + "</div>"
    + "</body>"
    + "</html>";

// Insert HTML using DocumentBuilder.
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.InsertHtml(htmlWithBase64);
doc.Save(@"C:\Temp\out.docx");