StructuredDocumentTag removeSelfOnly removes set text and leaves placeholder text

Hi,

I have a Word document with a mapped custom XML part containing placeholder data. I am trying to replace the text of the StructuredDocumentTags with a “new value” and then call removeSelfOnly() for a side-by-side compare. I can successfully replace the text of the SDT's first child node with the desired text, but when I call removeSelfOnly() on the SDT it ‘falls back’ to the placeholder text instead of leaving the content in place, as the documentation says it should.

Is there a way to call removeSelfOnly() while keeping my modified text and also leaving the custom XML placeholder text unmodified?

Code:

    public class AsposeDocumentChangesTest {

	@SuppressWarnings("unchecked")
	public static void main(String[] args) throws Exception {
		Document doc1 = new Document("C:\\AsposeTest\\" + "baseDoc.docx");
		//Document doc2 = new Document("C:\\AsposeTest\\" + "forCompareDoc.docx");

		org.w3c.dom.Document doc1Xml = getDocumentXml(doc1);

		XPathFactory xpathFactory = XPathFactory.newInstance();
		XPath xpath = xpathFactory.newXPath();

		for (StructuredDocumentTag sdt : (Iterable<StructuredDocumentTag>)doc1.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true)) {
			// search for the value of this sdt in the placeholder xml file
			String xPathString = sdt.getXmlMapping().getXPath().replaceAll(OfficeContractXmlConstants.PLACEHOLDER_NS_PREFIX+":", "")+"/text()";
			XPathExpression expr = xpath.compile(xPathString);
			String placeholderValueIfExists = (String) expr.evaluate(doc1Xml, XPathConstants.STRING);


			placeholderValueIfExists = getNewTextForDates(sdt.getText(), sdt);

			// update the value in the document
			updateValue(sdt, placeholderValueIfExists);

		}

		doc1.save("C:\\AsposeTest\\" + "1. textReplacement.docx", SaveFormat.DOCX);


		removeContentControls(doc1);

		doc1.save("C:\\AsposeTest\\" + "2. removedContentControlsPlainTextForCompare.docx", SaveFormat.DOCX);
	}

	private static void updateValue(StructuredDocumentTag sdt, String value) {
		// remove the text from the sdt and remember a reference node
		CompositeNode<?> referenceNode = null;
		for (Object node : sdt.getChildNodes(NodeType.RUN, true)) {
			referenceNode = ((Run)node).getParentNode();
			((Run)node).remove();
		}
		// add the updated value back to the reference node
		if (referenceNode != null) {
			Run updatedRun = new Run(sdt.getDocument(), value);
			referenceNode.appendChild(updatedRun);
		}
	}

	@SuppressWarnings("unchecked")
	private static void removeContentControls(Document doc) {
		for (StructuredDocumentTag sdt : (Iterable<StructuredDocumentTag>)doc.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true)) {
			try {
				sdt.removeSelfOnly(); // Removes just this SDT node itself, but keeps the content of it inside the document tree
			} catch (Exception e) {
				// TODO Auto-generated catch block
				e.printStackTrace();
			}
		}
	}

	private static String getNewTextForDates(String currentValue, StructuredDocumentTag sdt) throws Exception {
		// get the placeholder definition as additional formatting may be needed
		//boolean temp = sdt.isShowingPlaceholderText();
		if (sdt.getSdtType() == SdtType.DATE) {
			return "CONVERTED SOMEHOW"; // this would parse the received text; not important right now.
		}
		return currentValue;
	}

	private static org.w3c.dom.Document getDocumentXml(Document doc) {
		try {
			ByteArrayOutputStream outStream = new ByteArrayOutputStream();
			doc.save(outStream, SaveFormat.DOCX);
			return DocumentPlaceholderXmlExtractor.getPlaceholderXML(new ByteArrayInputStream(outStream.toByteArray()));
		} catch (IOException | ParserConfigurationException | SAXException e ) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		} catch (Exception e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
		return null;
	}
}

This is how my docs look:
Before performing the removeSelfOnly()

After performing the removeSelfOnly(), this text is coming from the related XML placeholder text.

@lconde57 It looks like your structured document tag has an XML mapping. When you call removeSelfOnly(), the SDT content is updated from the mapped custom XML part. Try removing the mapping before updating the SDT value:

    if (sdt.getXmlMapping().isMapped())
        sdt.getXmlMapping().delete();

If this does not help, please attach your input document here for testing. We will check the issue and provide you more information.

Your solution got me the behavior I wanted. Thank you.

Sorry to revive the thread, I just wanted to know the reason for this behavior. The documentation for removeSelfOnly() reads “Removes just this SDT node itself, but keeps the content of it inside the document tree.” Is the XPath text what counts as the content?

@lconde57 In your case, the content control's value is taken from the custom XML part using XPath. So before removing the content control, Aspose.Words updates its value from that mapping.
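
To make the ordering easier to picture, here is a small toy model in plain Java. It does not use Aspose.Words, and it is not how the library is implemented; `ToyControl`, `unwrap()`, `editThenUnwrap()` and the `/root/date` key are all hypothetical names made up for this illustration. It only mimics the behavior described above: a mapped control refreshes its text from the backing XML data before it is unwrapped, so a manual edit survives only if the mapping is deleted first.

```java
import java.util.HashMap;
import java.util.Map;

public class MappedControlSketch {

    /** Hypothetical stand-in for a mapped content control; NOT an Aspose.Words class. */
    static final class ToyControl {
        private final Map<String, String> customXmlPart; // stand-in for the custom XML part
        private String xpath; // null means "not mapped"
        private String text;  // what the control currently displays

        ToyControl(Map<String, String> customXmlPart, String xpath) {
            this.customXmlPart = customXmlPart;
            this.xpath = xpath;
            this.text = customXmlPart.get(xpath); // initial text comes from the mapping
        }

        boolean isMapped() { return xpath != null; }

        void deleteMapping() { xpath = null; }

        void setText(String newText) { text = newText; }

        /** Mimics the described behavior: refresh from the mapping, then leave the text behind. */
        String unwrap() {
            if (isMapped()) {
                text = customXmlPart.get(xpath); // the mapped value overwrites the manual edit
            }
            return text;
        }
    }

    /** Edits the control's text and unwraps it, optionally deleting the mapping first. */
    static String editThenUnwrap(boolean deleteMappingFirst) {
        Map<String, String> part = new HashMap<>();
        part.put("/root/date", "PLACEHOLDER");
        ToyControl control = new ToyControl(part, "/root/date");
        if (deleteMappingFirst) {
            control.deleteMapping();
        }
        control.setText("new value");
        return control.unwrap();
    }

    public static void main(String[] args) {
        System.out.println(editThenUnwrap(false)); // prints "PLACEHOLDER"
        System.out.println(editThenUnwrap(true));  // prints "new value"
    }
}
```

In both cases the entry in the toy `customXmlPart` map is never written to, which corresponds to the custom XML placeholder data staying unmodified.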