Extract Rich Text from Document

What's the best way to extract a section of rich text from the nodes of a Document? If that's not possible, is there a good workaround? Thanks.

If RTF is too hard, is it possible to do that as HTML instead?


Thanks for your inquiry. There is no direct way to achieve this. However, you can copy the nodes that you need to convert into a separate document and then save that document as RTF or HTML.
Please let me know if you need more assistance; I will be glad to help you.
Best regards,

Could you provide more details? Sorry, I’m still not sure how to do this. Thanks.


Thanks for your inquiry. Let’s suppose that you would like to get the RTF representation of a table in your document. To achieve this, follow the steps below:

  1. Open the document that contains the table.
  2. Create an empty document.
  3. Copy the table into the empty document.
  4. Save the temporary document in RTF format.
  5. (optional) Get the RTF string.

Here is code that demonstrates the main idea.

import com.aspose.words.*;
import java.io.ByteArrayOutputStream;

// Open the source document.
Document srcDoc = new Document("C:\\Temp\\in.doc");
// Create a temporary document.
Document tmpDoc = new Document();
// Get the node that should be converted to RTF (here, the first table).
Node srcNode = srcDoc.getChild(NodeType.TABLE, 0, true);
// Import the node into the temporary document.
Node tmpNode = tmpDoc.importNode(srcNode, true, ImportFormatMode.KEEP_SOURCE_FORMATTING);
// Insert the imported node into the body of the temporary document.
tmpDoc.getFirstSection().getBody().appendChild(tmpNode);
// Save the temporary document as RTF (here I just get the RTF string).
String rtfString = getRtfString(tmpDoc);


private static String getRtfString(Document doc) throws Exception
{
    // Save the document to a stream as RTF (you can also save it as HTML, for example).
    ByteArrayOutputStream rtfStream = new ByteArrayOutputStream();
    doc.save(rtfStream, SaveFormat.RTF);
    // Get the string from the stream (decoded with the platform default charset).
    return rtfStream.toString();
}
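
One note on the last step of getRtfString: toString() without arguments decodes the bytes with the platform default charset. That is usually harmless for RTF, but if you save as HTML instead, the output may contain non-ASCII characters. Below is a minimal sketch of decoding the stream with an explicit charset. It uses only the standard Java library; the class name StreamToText, the method toText, and the choice of UTF-8 are my assumptions for illustration, so use whatever encoding you actually save the document with.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class StreamToText {
    // Hypothetical helper: decodes the collected bytes as UTF-8
    // instead of relying on the platform default charset.
    static String toText(ByteArrayOutputStream stream) {
        return new String(stream.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the bytes that doc.save(stream, ...) would write.
        String html = "<p>caf\u00e9</p>";
        byte[] bytes = html.getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        stream.write(bytes, 0, bytes.length);

        // The decoded text matches the original because both sides use UTF-8.
        System.out.println(toText(stream).equals(html));
    }
}
```

In getRtfString you would return the result of such a helper in place of rtfStream.toString().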

Hope this helps.
Best regards.

This works great. Thanks!