Hi,
I have a Word document with a mapped custom XML part that holds placeholder data. I am trying to replace the text of each StructuredDocumentTag with a “new value” and then call removeSelfOnly() so the result can be used for a side-by-side compare. Replacing the text of the SDT's first child node works, but when I call removeSelfOnly() on the SDT the content ‘falls back’ to the placeholder text instead of being left in place, as the documentation says it should be.
Is there a way to call removeSelfOnly() while keeping my modified text and also leaving the custom XML placeholder text unmodified?
Code:
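import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.xml.sax.SAXException;

import com.aspose.words.CompositeNode;
import com.aspose.words.Document;
import com.aspose.words.Node;
import com.aspose.words.NodeType;
import com.aspose.words.Run;
import com.aspose.words.SaveFormat;
import com.aspose.words.SdtType;
import com.aspose.words.StructuredDocumentTag;

// OfficeContractXmlConstants and DocumentPlaceholderXmlExtractor are project-specific helper classes; their imports are omitted here.
public class AsposeDocumentChangesTest {
    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        Document doc1 = new Document("C:\\AsposeTest\\" + "baseDoc.docx");
        //Document doc2 = new Document("C:\\AsposeTest\\" + "forCompareDoc.docx");
        org.w3c.dom.Document doc1Xml = getDocumentXml(doc1);
        XPathFactory xpathFactory = XPathFactory.newInstance();
        XPath xpath = xpathFactory.newXPath();
        for (StructuredDocumentTag sdt : (Iterable<StructuredDocumentTag>) doc1.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true)) {
            // search for the value of this sdt in the placeholder xml file
            // (replace() treats the prefix as a literal string, not as a regex)
            String xPathString = sdt.getXmlMapping().getXPath().replace(OfficeContractXmlConstants.PLACEHOLDER_NS_PREFIX + ":", "") + "/text()";
            XPathExpression expr = xpath.compile(xPathString);
            String placeholderValueIfExists = (String) expr.evaluate(doc1Xml, XPathConstants.STRING);
            // NOTE: the looked-up value is overwritten here; for now the SDT's current text is used instead
            placeholderValueIfExists = getNewTextForDates(sdt.getText(), sdt);
            // update the value in the document
            updateValue(sdt, placeholderValueIfExists);
        }
        doc1.save("C:\\AsposeTest\\" + "1. textReplacement.docx", SaveFormat.DOCX);
        removeContentControls(doc1);
        doc1.save("C:\\AsposeTest\\" + "2. removedContentControlsPlainTextForCompare.docx", SaveFormat.DOCX);
    }
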
    private static void updateValue(StructuredDocumentTag sdt, String value) {
        // remove the text from the sdt and remember a reference node
        CompositeNode<?> referenceNode = null;
        // iterate over a snapshot (toArray) because the loop removes runs from the live collection
        for (Node node : sdt.getChildNodes(NodeType.RUN, true).toArray()) {
            Run run = (Run) node;
            referenceNode = run.getParentNode();
            run.remove();
        }
        // add the updated value back to the reference node
        if (referenceNode != null) {
            Run updatedRun = new Run(sdt.getDocument(), value);
            referenceNode.appendChild(updatedRun);
        }
    }

    private static void removeContentControls(Document doc) {
        // iterate over a snapshot (toArray) because removeSelfOnly() changes the live collection
        for (Node node : doc.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true).toArray()) {
            StructuredDocumentTag sdt = (StructuredDocumentTag) node;
            try {
                sdt.removeSelfOnly(); // Removes just this SDT node itself, but keeps the content of it inside the document tree
            } catch (Exception e) {
                // test code: just log the failure and continue with the next SDT
                e.printStackTrace();
            }
        }
    }

    private static String getNewTextForDates(String currentValue, StructuredDocumentTag sdt) throws Exception {
        // get the placeholder definition as additional formatting may be needed
        //boolean temp = sdt.isShowingPlaceholderText();
        if (sdt.getSdtType() == SdtType.DATE) {
            return "CONVERTED SOMEHOW"; // this would parse the received text; not important right now
        }
        return currentValue;
    }

    private static org.w3c.dom.Document getDocumentXml(Document doc) {
        try {
            // save the document to memory and extract the placeholder XML from the resulting package
            ByteArrayOutputStream outStream = new ByteArrayOutputStream();
            doc.save(outStream, SaveFormat.DOCX);
            return DocumentPlaceholderXmlExtractor.getPlaceholderXML(new ByteArrayInputStream(outStream.toByteArray()));
        } catch (IOException | ParserConfigurationException | SAXException e) {
            // test code: just log I/O and parsing failures
            e.printStackTrace();
        } catch (Exception e) {
            // test code: just log anything else, e.g. a failure in doc.save()
            e.printStackTrace();
        }
        return null;
    }
}
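Both updateValue and removeContentControls remove nodes from the tree they are looping over. To make clear what I expect from that step, here is a plain-Java sketch of it: collect the wrapper nodes into a separate list first, then unwrap each one so its children stay in the parent. This is only a stand-in model and not Aspose code; the SnapshotUnwrap class, its Node type and unwrap() are made up for illustration, and unwrap() only mimics what I understand removeSelfOnly() to do.

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotUnwrap {

    /** Minimal tree node; 'wrapper' marks nodes that play the role of a content control (hypothetical model). */
    static final class Node {
        final String text;
        final boolean wrapper;
        final List<Node> children = new ArrayList<>();
        Node parent;

        Node(String text, boolean wrapper) {
            this.text = text;
            this.wrapper = wrapper;
        }

        /** Appends a child and returns this node so calls can be chained. */
        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return this;
        }

        /** Stand-in for removeSelfOnly(): splice the children into the parent at this node's position. */
        void unwrap() {
            int index = parent.children.indexOf(this);
            parent.children.remove(index);
            for (Node child : children) {
                child.parent = parent;
            }
            parent.children.addAll(index, children);
            children.clear();
            parent = null;
        }

        /** Concatenates the text of the subtree; wrappers are shown in square brackets. */
        String render() {
            StringBuilder sb = new StringBuilder(text);
            for (Node child : children) {
                sb.append(child.render());
            }
            return wrapper ? "[" + sb + "]" : sb.toString();
        }
    }

    /** Collects all wrapper descendants in document order into a detached list. */
    static void collectWrappers(Node node, List<Node> out) {
        for (Node child : node.children) {
            if (child.wrapper) {
                out.add(child);
            }
            collectWrappers(child, out);
        }
    }

    /** Unwraps every wrapper below root, iterating over a snapshot rather than the changing tree. */
    static int unwrapAll(Node root) {
        List<Node> snapshot = new ArrayList<>();
        collectWrappers(root, snapshot);
        for (Node wrapperNode : snapshot) {
            wrapperNode.unwrap();
        }
        return snapshot.size();
    }

    /** Builds a small sample tree: one simple wrapper, plain text, and two nested wrappers. */
    static Node sample() {
        Node root = new Node("", false);
        root.add(new Node("", true).add(new Node("new value", false)));
        root.add(new Node(" and ", false));
        root.add(new Node("", true).add(new Node("", true).add(new Node("nested", false))));
        return root;
    }

    public static void main(String[] args) {
        Node root = sample();
        System.out.println(root.render());
        System.out.println(unwrapAll(root) + " wrappers removed");
        System.out.println(root.render());
    }
}
```

In this model the sample renders as "[new value] and [[nested]]" before and "new value and nested" after, which is the behaviour I expected from the real document. As far as I can tell from the API reference, the closest Aspose equivalent of the snapshot is NodeCollection.toArray(), but the model says nothing about how a mapped SDT refreshes its content.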
This is how my docs look:
Before performing the removeSelfOnly()
After performing the removeSelfOnly(); this text is coming from the related XML placeholder text.
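A side note on the XPath lookup in main, in case it matters: stripping the namespace prefix from the mapped XPath depends on how the placeholder XML is parsed. A possible alternative, sketched here with JDK classes only, keeps the prefix and registers a NamespaceContext on the XPath object. The MappedXPathLookup class, the sample XML, the `ns0` prefix and the `urn:example:placeholders` URI are all made up for illustration; the real prefix and URI would have to come from the document's own mapping.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Iterator;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

class MappedXPathLookup {

    /** Resolves exactly one prefix; every other prefix maps to "no namespace". */
    static NamespaceContext singlePrefix(final String prefix, final String uri) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String p) {
                return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceUri) {
                return uri.equals(namespaceUri) ? prefix : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceUri) {
                return uri.equals(namespaceUri)
                        ? Collections.singletonList(prefix).iterator()
                        : Collections.<String>emptyIterator();
            }
        };
    }

    /** Returns the text at mappedXPath, or an empty string if the node does not exist. */
    static String textAt(String xml, String mappedXPath, String prefix, String uri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required so that prefixed XPath steps can match
            org.w3c.dom.Document dom = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(singlePrefix(prefix, uri));
            return (String) xpath.evaluate(mappedXPath + "/text()", dom, XPathConstants.STRING);
        } catch (Exception e) {
            throw new IllegalStateException("XPath lookup failed for " + mappedXPath, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical placeholder XML; the real one lives in the document's custom XML part.
        String xml = "<contract xmlns=\"urn:example:placeholders\">"
                + "<startDate>2020-01-01</startDate></contract>";
        System.out.println(textAt(xml, "/ns0:contract[1]/ns0:startDate[1]",
                "ns0", "urn:example:placeholders"));
    }
}
```

With the sample XML above, the lookup for /ns0:contract[1]/ns0:startDate[1] returns 2020-01-01, and a path that matches nothing returns an empty string rather than null.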