Need to retain paragraphs while merging two consecutive SDTs

Hi Team,

I want to merge two consecutive structured document tags (SDTs) into one tag while keeping the spacing between their paragraphs.
Currently I am using the code below for the merge:

public static void main(String[] args) throws Exception {
        
    Document document = new Document("..\\158464.docx");
       
    StructuredDocumentTag prevStd = null;
    for (Object st : document.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true)) {
        StructuredDocumentTag std = (StructuredDocumentTag) st;

            
        if (prevStd == null) {
            prevStd = std;
            continue;
        }

        if ((std.getSdtType() == SdtType.RICH_TEXT)) {
            Boolean consecutiveNode = false;
            // Check for nodes between the SDTs to decide whether they can be merged.
            consecutiveNode = true;
            List intermediateNodes = null;
            try {
                intermediateNodes = new BookMarkNodeExtractorService().extractContent(prevStd, std, false);
            } catch (Exception ex) {
                //LOGGER.warn("not able to extract bookmark node in the StructuredDocumentTag due to : ", ex);
                consecutiveNode = false;
            }
            if (!ListUtil.isEmpty(intermediateNodes)) {
                for (Object node : intermediateNodes) {
                    if (!StringUtil.isEmpty(((Node) node).getText().trim())) {
                        consecutiveNode = false;
                        break;
                    }
                }
            }
            if (Boolean.TRUE.equals(consecutiveNode)) {
                prevStd.appendChild(std);
                std.removeSelfOnly();
            }
        }
    }
        
    document.save("..\\158464_Test.docx");
    System.out.println("Doc generated");
}

However, the merged SDT does not keep the spaces and blank lines that were between the consecutive SDTs.
I am attaching the input and output documents.

Please suggest the best way to retain the spacing and blank lines between SDTs.

input document -
158464.docx (33.9 KB)

output document
158464_Test.docx (27.4 KB)

@am.it.chauhan The spacing you are talking about is made up of empty paragraphs between the SDTs.

Your code moves only the content of one SDT into the other; the paragraph between these SDTs is not moved.

Hi @alexey.noskov,
Can you please show me an efficient way to also merge the empty paragraphs when appending an SDT to the previous SDT?

@am.it.chauhan You can simply move the empty paragraph between the SDTs into the target SDT when you concatenate their content. Something like this:

Document doc = new Document("C:\\Temp\\in.docx");
// Get structured document tags that should be concatenated.
Node[] sdts = doc.getFirstSection().getBody().getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, false).toArray();
StructuredDocumentTag first = (StructuredDocumentTag)sdts[0];
for (int i = 1; i < sdts.length; i++)
{
    StructuredDocumentTag next = (StructuredDocumentTag)sdts[i];
    // If an empty paragraph directly follows the main SDT, move it into the SDT to keep the spacing.
    if (first.getNextSibling() != null &&
            first.getNextSibling().getNodeType() == NodeType.PARAGRAPH &&
            !((Paragraph)first.getNextSibling()).hasChildNodes())
    {
        first.appendChild(first.getNextSibling());
    }
    // Copy content from the next SDT and remove it.
    while (next.hasChildNodes())
    {
        first.appendChild(next.getFirstChild());
    }
    next.remove();
}
doc.save("C:\\Temp\\out.docx");
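
Note that the snippet above moves at most one empty paragraph per gap. If there can be several empty paragraphs in a row between SDTs, the same idea can be applied in a loop. Below is a minimal, library-free sketch of that logic on a simplified model. It does not use the Aspose.Words API: `SdtMergeSketch`, `Block`, `sdt`, `paragraph` and `mergeConsecutive` are hypothetical names made up for illustration, and a body is assumed to be a flat list of SDTs and plain paragraphs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model, NOT Aspose.Words: a body is a flat list of blocks.
class SdtMergeSketch {

    // Hypothetical stand-in for a body-level node: an SDT that holds
    // paragraph texts, or a plain paragraph holding exactly one text.
    static final class Block {
        final boolean isSdt;
        final List<String> paragraphs = new ArrayList<>();

        Block(boolean isSdt, String... texts) {
            this.isSdt = isSdt;
            this.paragraphs.addAll(Arrays.asList(texts));
        }

        @Override
        public String toString() {
            return (isSdt ? "SDT" : "P") + paragraphs;
        }
    }

    static Block sdt(String... texts) { return new Block(true, texts); }

    static Block paragraph(String text) { return new Block(false, text); }

    // A plain paragraph counts as empty when it has no visible text.
    static boolean isEmptyParagraph(Block b) {
        return !b.isSdt && b.paragraphs.get(0).trim().isEmpty();
    }

    // Merges SDTs separated only by empty paragraphs and keeps those
    // empty paragraphs inside the merged SDT, so the spacing survives.
    static List<Block> mergeConsecutive(List<Block> body) {
        List<Block> result = new ArrayList<>();
        Block current = null;                    // SDT being extended
        List<Block> pending = new ArrayList<>(); // empty paragraphs after it

        for (Block b : body) {
            if (b.isSdt && current != null) {
                // Only empty paragraphs lie in between: pull them in first.
                for (Block p : pending) {
                    current.paragraphs.addAll(p.paragraphs);
                }
                pending.clear();
                current.paragraphs.addAll(b.paragraphs);
            } else if (b.isSdt) {
                // Start a new run; copy so the input list is not modified.
                current = sdt(b.paragraphs.toArray(new String[0]));
                result.add(current);
            } else if (current != null && isEmptyParagraph(b)) {
                pending.add(b); // decide later whether it moves into the SDT
            } else {
                // Real content ends the run; spacing stays where it was.
                result.addAll(pending);
                pending.clear();
                result.add(b);
                current = null;
            }
        }
        result.addAll(pending); // trailing empty paragraphs stay in the body
        return result;
    }

    public static void main(String[] args) {
        List<Block> body = Arrays.asList(
                sdt("A"), paragraph(""), paragraph(""), sdt("B"),
                paragraph("text"), sdt("C"));
        System.out.println(mergeConsecutive(body));
    }
}
```

With Aspose.Words, the equivalent change would be to replace the `if` around `first.appendChild(first.getNextSibling())` with a `while` that uses the same condition, so every empty paragraph following the main SDT is moved before the next SDT's content is appended.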

Hi Alexey,

Thanks for the prompt help.
It worked.
