Extract Content Controls Data from RTF & Convert to Word (DOCX DOC) Document using Java | Custom Bookmarks

@Sudha_Mylapalli,

We will share our findings today. Stay tuned.

Yes, there should not be any problems when you use the latest licensed version of Aspose.Words for Java, i.e. 20.5, to convert ODT to DOCX on your end. You will also not see the evaluation watermark string in the generated document.

Please ZIP and upload your simplified input Word document containing the bar code here for testing. We will then investigate the scenario on our end and provide you more information.

@awais.hafeez

We are blocked on product issues because of this document conversion. Can we have a call and share our screen to walk through all our issues and requirements once, so that you can easily understand them and suggest fixes?

We are ready to purchase a license immediately if this document conversion works with customized bookmarks.

Please help us with this ASAP.

Thanks
Mahesh Palagani

@maheshpalagani,

Regarding the “input.docx” document, please try using the following Java code that prints the text of Content Controls with Titles V3:B or V3:G:

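import com.aspose.words.*;

Document doc = new Document("E:\\Temp\\BookmarkSampleFile\\input.docx");

// Visit every content control (structured document tag) in the document.
for (StructuredDocumentTag sdt :
        (Iterable<StructuredDocumentTag>) doc.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG,
                true)) {
    // Constant-first equals() stays safe even if a title is missing.
    if ("V3:B".equals(sdt.getTitle()) || "V3:G".equals(sdt.getTitle())) {
        // Print the plain text held by the matching content control.
        System.out.println(sdt.toString(SaveFormat.TEXT).trim());
    }
}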

@maheshpalagani,

Regarding the RTFSampleInput.rtf document, please try running the following code.

import com.aspose.words.*;
import java.util.ArrayList;
import java.util.regex.Pattern;

Document doc = new Document("E:\\Temp\\BookmarkSampleFile\\RTFSampleInput.rtf");

ReplaceHandler handler = new ReplaceHandler();
FindReplaceOptions opts = new FindReplaceOptions();
opts.setDirection(FindReplaceDirection.BACKWARD);
opts.setReplacingCallback(handler);

Pattern searchPattern = Pattern.compile("\\[BK:([^\\]]*)\\]", Pattern.CASE_INSENSITIVE);
for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
    para.getRange().replace(searchPattern, "", opts);

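// Print the collected markers, numbered from 1.
int i = 1;
for (String str : handler.list)
    System.out.println(i++ + ". " + str);

// Callback that records every match; declare it as a nested class of the class that holds the code above.
static class ReplaceHandler implements IReplacingCallback {
    public ArrayList<String> list = new ArrayList<>();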

    public int replacing(ReplacingArgs e) throws Exception {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.getMatchNode();

        // The first (and may be the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.getMatchOffset() > 0)
            currentNode = splitRun((Run) currentNode, e.getMatchOffset());

        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.getMatch().group().length();
        while ((remainingLength > 0) && (currentNode != null) && (currentNode.getText().length() <= remainingLength)) {
            runs.add(currentNode);
            remainingLength = remainingLength - currentNode.getText().length();

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do {
                currentNode = currentNode.getNextSibling();
            } while ((currentNode != null) && (currentNode.getNodeType() != NodeType.RUN));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0)) {
            splitRun((Run) currentNode, remainingLength);
            runs.add(currentNode);
        }

        String value = e.getMatch().group(0).trim();
        // if (!list.contains(value))
        list.add(value);

        // Signal to the replace engine to do nothing because we have already done everything we wanted.
        return ReplaceAction.SKIP;
    }

    /**
     * Splits text of the specified run into two runs. Inserts the new run just
     * after the specified run.
     */
    private Run splitRun(Run run, int position) throws Exception {
        Run afterRun = (Run) run.deepClone(true);
        afterRun.setText(run.getText().substring(position));
        run.setText(run.getText().substring(0, position));
        run.getParentNode().insertAfter(afterRun, run);
        return afterRun;
    }
}
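
To check what the search pattern above picks up without involving Aspose.Words at all, here is a minimal plain-Java sketch. The class name MarkerPatternDemo, the method markerNames, and the sample sentence are made up for illustration; only the regular expression is taken from the code above. Note that the handler above stores the whole marker (group 0), while this sketch returns only the name inside it (group 1).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.Words.
class MarkerPatternDemo {
    // Same expression as in the Aspose.Words code above.
    static final Pattern MARKER =
            Pattern.compile("\\[BK:([^\\]]*)\\]", Pattern.CASE_INSENSITIVE);

    /** Returns the text inside each [BK:...] marker, in order of appearance. */
    static List<String> markerNames(String text) {
        List<String> names = new ArrayList<>();
        Matcher m = MARKER.matcher(text);
        while (m.find()) {
            // group(0) would be the whole marker; group(1) is the part after "BK:".
            names.add(m.group(1).trim());
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed sample text; the real RTF content may look different.
        String sample = "Dear [BK:CustomerName], your balance is [bk:Amount Due].";
        System.out.println(markerNames(sample)); // prints [CustomerName, Amount Due]
    }
}
```

Because the pattern is compiled with CASE_INSENSITIVE, both [BK:...] and [bk:...] markers are matched.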

Hope this helps you achieve what you are looking for.

@awais.hafeez
We are getting an issue with Aspose while converting an OpenOffice document to a Word document. The converted document contains the following text:
"This document was truncated here because it was created using Aspose.Words in Evaluation Mode."
Please share a solution.

@Sudha_Mylapalli,

This happens because you are not applying the Aspose.Words for Java license before creating the Document instance. If you want to test Aspose.Words without the evaluation version limitations, you can also request a 30-day Temporary License. Please refer to "How to get a Temporary License?".

A post was split to a new topic: Unable to get bookmarks in Word document

@awais.hafeez
Thanks for your response. We are still facing issues converting bookmarks from OpenOffice to a Word document.
We are attaching some more detailed documents for your understanding. Please go through them and let us know if you need any clarification from my end.

2020-06-03_12-14-31 => In this picture you can see the bookmarks that appear when we hover the mouse over a bookmark location in the OpenOffice document.

E018 - DEPP Delinquency Notice.odt => This is the source OpenOffice file, which has bookmarks.

Manuallyconverted_word_doc_E018 - DEPP Delinquency Notice.docx => This is the document that we converted from OpenOffice to Word manually (you can see our custom bookmarks with V3:B, V3:C).

Aspose_Converted_Test_E018 - DEPP Delinquency Notice.docx => This is the Word document that the Aspose JAR converted from OpenOffice. Here the bookmarks came through as normal text.
We want to automate this kind of conversion from OpenOffice to Word using Aspose.

sample_docx.zip (750.0 KB)

@Sudha_Mylapalli,

What desktop application did you use to generate “Manuallyconverted_word_doc_E018 - DEPP Delinquency Notice.docx” on your end - MS Word or OpenOffice Writer?

Please list the complete steps that you performed in MS Word (or OpenOffice Writer) to create this expected document (Manuallyconverted_word_doc_E018 - DEPP Delinquency Notice.docx). We will then provide you with code that performs the same steps programmatically using Aspose.Words to get the desired output.