Hi,
We are currently using Aspose Words Java version 15.11.0 in our application.
We have a requirement to convert the Rich Text content controls in a Word document into Plain Text content controls. That is, we need to retrieve all the StructuredDocumentTag objects of type SdtType.RICH_TEXT from the Word document and convert them to StructuredDocumentTag objects of type SdtType.PLAIN_TEXT.
When converting a Rich Text StructuredDocumentTag into a Plain Text StructuredDocumentTag, we only need to retain the text portion of the content, because a Plain Text StructuredDocumentTag can only contain text data (i.e. only Run nodes); it can’t contain complex objects such as field codes, images, etc.
# Query 1
Is there a way to take a Rich Text content control in a Word document and convert it into a Plain Text content control (StructuredDocumentTag)?
We just need to retain the plain text portion of the content during the conversion from Rich Text into Plain Text content control.
If possible, it would be nice to carry over the formatting of the first character or first Run node of the Rich Text content control to the Plain Text content control.
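To make the expected result concrete, here is a rough sketch of the flattening rule in plain Java. It deliberately does not use the Aspose.Words API: `Node`, `run`, `container`, `collectRuns` and `flatten` are hypothetical stand-ins invented for this illustration, and the font names are made up. It only shows which content should survive (the text of the Run nodes, in document order) and where the formatting should come from (the first Run).

```java
import java.util.ArrayList;
import java.util.List;

public class PlainTextFlattenSketch {

    /** Hypothetical stand-in for a document node; NOT an Aspose.Words class. */
    static class Node {
        final String kind;   // e.g. "sdt", "paragraph", "run", "image", "field"
        final String text;   // meaningful only when kind is "run"
        final String font;   // meaningful only when kind is "run"
        final List<Node> children = new ArrayList<>();

        Node(String kind, String text, String font) {
            this.kind = kind;
            this.text = text;
            this.font = font;
        }
    }

    /** Builds a run node carrying text and a (made-up) font name. */
    static Node run(String text, String font) {
        return new Node("run", text, font);
    }

    /** Builds a non-run node of the given kind with the given children. */
    static Node container(String kind, Node... kids) {
        Node node = new Node(kind, null, null);
        for (Node kid : kids) {
            node.children.add(kid);
        }
        return node;
    }

    /** Collects run nodes depth-first in document order; other leaves are skipped. */
    static void collectRuns(Node node, List<Node> out) {
        if ("run".equals(node.kind)) {
            out.add(node);
            return;
        }
        for (Node child : node.children) {
            collectRuns(child, out);
        }
    }

    /** Returns one run holding all the text, formatted like the first run found. */
    static Node flatten(Node richTextControl) {
        List<Node> runs = new ArrayList<>();
        collectRuns(richTextControl, runs);
        StringBuilder text = new StringBuilder();
        for (Node r : runs) {
            text.append(r.text);
        }
        String font = runs.isEmpty() ? null : runs.get(0).font;
        return run(text.toString(), font);
    }

    public static void main(String[] args) {
        // A rich text control holding two runs, an image and a field.
        Node control = container("sdt",
                container("paragraph",
                        run("Total: ", "Arial Bold"),
                        container("image"),
                        run("42", "Times"),
                        container("field")));
        Node result = flatten(control);
        // prints: Total: 42 [Arial Bold]
        System.out.println(result.text + " [" + result.font + "]");
    }
}
```

In the real document, the single resulting run would become the only content of the new Plain Text content control; the image and the field are simply dropped.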
If you could provide sample code, that would be great.
# Query 2
Please refer to the attached “RichTextControls.docx”. This document has 7 rich text content controls in different places, inside different kinds of composite nodes.
For example, one rich text content control is inside the Heading directly under the “w:body” tag,
some rich text content controls are within paragraph text inside the “w:p” tag (paragraph tag),
and some are inside table cells within the “w:tr” tag.
The code sample should be able to convert all the Rich Text content controls in “RichTextControls.docx” to Plain Text content controls.
We appreciate your help.
-Satya