Line break in extracted text shows as an unknown character

Hi,

I am extracting the content between the start and end of a bookmark. When the extracted content is converted to text, a line break becomes an unknown character instead of “\n”. As a result, the UI shows a question mark where the line break should be. Please help me figure out why.

I am attaching the sample code, WordTitleImport.zip (3.3 KB), and the input file, test (2).docx (19.0 KB).

Thank you

@Gptrnt In your case, you can simply use Bookmark.getText() to extract the content of the bookmark:

Document document = new Document("C:\\Temp\\in.docx");
int i = 1;
BookmarkCollection bookmarks = document.getRange().getBookmarks();
for (Bookmark bookmark : bookmarks)
{
    if (bookmark.getName().equals("title" + i))
    {
        String title = bookmark.getText();
        i++;
        System.out.println(title);
    }
}

If you need to use extractContent, I would suggest putting the extracted content into a separate document and then converting it to text:

Document document = new Document("C:\\Temp\\in.docx");
int i = 1;
BookmarkCollection bookmarks = document.getRange().getBookmarks();
for (Bookmark bookmark : bookmarks)
{
    if (bookmark.getName().equals("title" + i))
    {
        i++;
        ArrayList<Node> nodes = ExtractContentHelper.extractContent(bookmark.getBookmarkStart(), bookmark.getBookmarkEnd(), false);
        Document subDoc = ExtractContentHelper.generateDocument(document, nodes);
        String title = subDoc.toString(SaveFormat.TEXT).trim();
        System.out.println(title);
    }
}

The ExtractContentHelper.generateDocument method used above is as follows:

public static Document generateDocument(Document srcDoc, ArrayList<Node> nodes)
{
    // Clone source document to preserve source styles.
    Document dstDoc = (Document)srcDoc.deepClone(false);

    // Import each node from the list into the new document. dstDoc is a clone of srcDoc,
    // so its destination styles match the source styles.
    NodeImporter importer = new NodeImporter(srcDoc, dstDoc, ImportFormatMode.USE_DESTINATION_STYLES);

    for (Node node : nodes)
    {
        if (node.getNodeType() == NodeType.SECTION)
        {
            Section srcSection = (Section)node;
            Section importedSection = (Section)importer.importNode(srcSection, false);
            importedSection.appendChild(importer.importNode(srcSection.getBody(), false));
            for (HeaderFooter hf : srcSection.getHeadersFooters())
            {
                importedSection.getHeadersFooters().add(importer.importNode(hf, true));
            }

            dstDoc.appendChild(importedSection);
        }
        else
        {
            Node importNode = importer.importNode(node, true);
            dstDoc.getLastSection().getBody().appendChild(importNode);
        }
    }

    return dstDoc;
}
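
If the text has already been obtained and still contains the unknown character, it can also be normalized in plain Java before it reaches the UI. The sketch below assumes the character is the vertical tab (U+000B) that Word text uses for a manual line break, and that paragraph ends arrive as carriage returns. The class name LineBreakNormalizer and its normalize method are hypothetical helpers for illustration and are not part of Aspose.Words.

```java
// Hypothetical helper, not part of Aspose.Words.
final class LineBreakNormalizer {
    // Assumption: manual line breaks appear as the vertical tab character (U+000B).
    private static final char VERTICAL_TAB = 0x0B;
    // Assumption: paragraph ends appear as a carriage return, possibly followed by a line feed.
    private static final char CARRIAGE_RETURN = '\r';

    // Replaces vertical tabs, lone carriage returns, and CR LF pairs with a single '\n'.
    static String normalize(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == CARRIAGE_RETURN) {
                // Collapse a CR LF pair into one line feed.
                if (i + 1 < raw.length() && raw.charAt(i + 1) == '\n') {
                    i++;
                }
                out.append('\n');
            } else if (c == VERTICAL_TAB) {
                out.append('\n');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "First line" + VERTICAL_TAB + "Second line" + CARRIAGE_RETURN;
        System.out.println(normalize(raw).trim());
    }
}
```

Calling LineBreakNormalizer.normalize(title) on the extracted string before displaying it would then leave only "\n" as the line separator. Whether this is needed at all depends on how the text was produced, so it is worth checking the actual character code first.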