Hello,
I am using Aspose.Words for Java version 24.6.
I am extracting all the tags defined in a Word template. Then, I process them to check if the expressions used in the tags are valid as per model rules.
To do so, I expect that the tags are retrieved in the order they are used.
Some of the tags could also define variables. Variables are defined first and then used.
So, to resolve the variable value, I depend on the order these tags appear in the Word document.
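To illustrate the order dependence, here is a minimal sketch of such a check. It is not the actual verification code: the class `TagOrderCheck`, the method `usedBeforeDeclared`, and the simplified regular expressions are assumptions made for illustration, and it only handles tags of the form `<<[expr]>>` and `<<var [name = expr]>>`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Aspose.Words): given tags in the order they were
// extracted, it reports variables that are referenced before their declaration.
final class TagOrderCheck {

    // <<var [name = expr]>> : group 1 is the variable name, group 2 the expression.
    private static final Pattern DECLARATION =
            Pattern.compile("<<\\s*var\\s*\\[\\s*(\\w+)\\s*=(.*)\\]\\s*>>");

    // <<[expr]>> : group 1 is the expression.
    private static final Pattern EXPRESSION =
            Pattern.compile("<<\\s*\\[(.*)\\]\\s*>>");

    // A bare identifier, i.e. one that is not part of a dotted path such as root.Name.
    private static final Pattern IDENTIFIER =
            Pattern.compile("(?<![\\w.])[A-Za-z_]\\w*(?![\\w.])");

    static List<String> usedBeforeDeclared(List<String> tagsInOrder) {
        // Pass 1: collect every variable name declared anywhere in the template.
        Set<String> allDeclared = new HashSet<>();
        for (String tag : tagsInOrder) {
            Matcher declaration = DECLARATION.matcher(tag);
            if (declaration.matches()) {
                allDeclared.add(declaration.group(1));
            }
        }

        // Pass 2: walk the tags in order and flag variables used too early.
        Set<String> declaredSoFar = new HashSet<>();
        List<String> problems = new ArrayList<>();
        for (String tag : tagsInOrder) {
            Matcher declaration = DECLARATION.matcher(tag);
            Matcher expression = EXPRESSION.matcher(tag);
            String body;
            String newVariable = null;
            if (declaration.matches()) {
                newVariable = declaration.group(1);
                body = declaration.group(2);
            } else if (expression.matches()) {
                body = expression.group(1);
            } else {
                continue; // Some other tag type; ignored in this sketch.
            }
            Matcher identifier = IDENTIFIER.matcher(body);
            while (identifier.find()) {
                String name = identifier.group();
                if (allDeclared.contains(name) && !declaredSoFar.contains(name)) {
                    problems.add(name);
                }
            }
            if (newVariable != null) {
                declaredSoFar.add(newVariable);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> declaredFirst = List.of(
                "<<var [name = user.Name]>>", "<<var [code = root.Code]>>",
                "<<var [footerVar = name + \" \" + code]>>", "<<[footerVar]>>");
        List<String> usedFirst = List.of(
                "<<[footerVar]>>", "<<var [name = user.Name]>>", "<<var [code = root.Code]>>",
                "<<var [footerVar = name + \" \" + code]>>");
        System.out.println(usedBeforeDeclared(declaredFirst)); // prints []
        System.out.println(usedBeforeDeclared(usedFirst));     // prints [footerVar]
    }
}
```

With the same set of tags, the result changes purely because of the order in which they are supplied, which is why the extraction order matters for the verification.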
I found that the order is preserved as long as the variables are both defined and used in the body. But when they are used inside the footer, I get the tags in a different order on my Windows machine (local) and on the Linux machine (dev server).
On a Windows machine, the output is:
{PARAGRAPH=[<<[root.Name]>>, <<var [name = user.Name]>>, <<var [code = root.Code]>>, <<var [footerVar = name + “ ” + code]>>, <<[root.Name] >>, <<[ root.Code]>>, <<[root.Description]>>, <<[footerVar]>>]}
Whereas on the Linux machine, the output is:
{PARAGRAPH=[<<[footerVar]>>], <<[root.Name]>>, <<var [name = user.Name]>>, <<var [code = root.Code]>>, <<var [footerVar = name + “ ” + code]>>, <<[root.Name] >>, <<[ root.Code]>>, <<[root.Description]>>}
On the Linux machine, the tag that uses the variable appears in the retrieved tags before the tag that declares it.
I understand that there are different types of header/footer (first, primary, even/odd).
The users are likely to use any type of header and/or footer and define variables and then use those variables. The order of the tags is important for me to perform model verification.
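One workaround I can think of, sketched here under assumptions, is to record where each tag was found and then impose a fixed order myself instead of relying on the traversal order. In the sketch below, the `Story` enum, the `LocatedTag` record, and the rule "body first, then headers, then footers" are all hypothetical; whether that rule matches the order in which the reporting engine evaluates the template is something I have not confirmed. The story of a paragraph could presumably be determined by checking whether it has an ancestor of type `NodeType.HEADER_FOOTER`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: orders extracted tags deterministically by the story they came from.
final class StoryOrder {

    // Assumed ranking: BODY first, on the assumption that variables are declared in the body.
    enum Story { BODY, HEADER, FOOTER }

    // One extracted tag, the story it was found in, and its index within that story.
    record LocatedTag(Story story, int position, String tag) {}

    // Returns the tags sorted by story rank, then by position inside the story.
    static List<String> inDeterministicOrder(List<LocatedTag> tags) {
        List<LocatedTag> sorted = new ArrayList<>(tags);
        sorted.sort(Comparator.comparing(LocatedTag::story)
                .thenComparingInt(LocatedTag::position));
        return sorted.stream().map(LocatedTag::tag).toList();
    }
}
```

With this, a footer tag that happens to be collected first would still be processed after the body tags, regardless of the platform the extraction runs on.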
Please guide me on how to handle this.
Please find the template attached.
Var_footer.docx (31 KB)
Logic to extract the tags:
private void testVariables() {
    final File templateFile =
            new File("path\\to\\input\\word\\templates", "Var_footer.docx");
    // The original pattern combined "(?-s)" with Pattern.DOTALL; the inline flag switches
    // DOTALL off again, so the effective pattern is simply this one. Compiled once, not per node.
    final Pattern tagPattern = Pattern.compile("<<.+?>>");
    try (final InputStream inputStream = new FileInputStream(templateFile)) {
        final Document doc = new Document(inputStream);
        // Collect every paragraph in the document, including those in headers and footers.
        final NodeCollection<?> childNodes = doc.getChildNodes(NodeType.PARAGRAPH, true);
        final Map<String, List<String>> result = new LinkedHashMap<>();
        final List<String> tagsCollected = new ArrayList<>();
        result.put(NodeType.getName(NodeType.PARAGRAPH), tagsCollected);
        for (int i = 0; i < childNodes.getCount(); i++) {
            final Node node = childNodes.get(i);
            final String nodeText = StringUtils.trimToEmpty(node.getText());
            if (nodeText.isEmpty()) {
                continue;
            }
            final String parentNodeType =
                    NodeType.getName(node.getParentNode().getNodeType());
            log.debug("{} inside {}", nodeText, parentNodeType);
            // Collapse runs of whitespace so that each tag sits on a single line.
            final String inputWithSingleWhitespace = StringUtils.normalizeSpace(nodeText);
            final Matcher matcher = tagPattern.matcher(inputWithSingleWhitespace);
            final List<String> extractedTags =
                    matcher.results().map(MatchResult::group).toList();
            if (CollectionUtils.isEmpty(extractedTags)) {
                continue;
            }
            // Tags are appended in the order the paragraphs are returned by getChildNodes.
            tagsCollected.addAll(extractedTags);
        }
        log.debug("Node text: {}", result);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}