How to remove additional empty paragraphs/space without deleting the bookmark of the previous content

Hi,

Please help me remove the additional empty paragraphs from a Word document.

I have attached a sample file (TEST.doc) that has additional empty space between sections. I tried the code below to delete the empty space, but it also deletes the bookmark [HD_CL_STD] of the previous content.

Document doc1 = new Document(pathToSaveContract + "TEST.doc");
removeExtraSpaceBeforeTitle(doc1, "PRICE VALUES");


public boolean removeExtraSpaceBeforeTitle(Document doc, String secTitle)
{
    try
    {
        // Collect every paragraph in the document, including those nested in tables.
        Node[] paras = doc.getChildNodes(NodeType.PARAGRAPH, true).toArray();
        int paraId = 0;
        for (int i = 0; i <paras.length; i++)
        {
            Paragraph p = (Paragraph) paras[i];
            if (p.getText() != null &&
                p.getText().trim().equalsIgnoreCase(secTitle))
            {
                System.out.println("Para (" + i + ")=" + p.getText());
                paraId = i;
                break;
            }
        }
        System.out.println("Para id=" + paraId);
        if (paraId> 0)
        {
            Paragraph para = (Paragraph) paras[paraId - 2];
            if (para.getChildNodes().getCount() == 0 || para.toString(SaveFormat.TEXT).trim().equals(""))
            {
                para.remove();
                System.out.println("Empty para Removed for " + secTitle);
            }
            else
            {
                return true;
            }
            removeExtraSpaceBeforeTitle(doc, secTitle);
        }

    }
    catch (Exception e)
    {
        e.printStackTrace();
    }
    return false;
}

Hi there,

Thanks for your inquiry. The BookmarkEnd node of the bookmark “HD_CL_STD” is inside the empty paragraph. In this case, we suggest inserting the BookmarkEnd node into the node that precedes its parent paragraph. Please check the following code example. Hope this helps you.

Document doc = new Document(MyDir + "TEST.doc");

DocumentBuilder builder = new DocumentBuilder(doc);
Bookmark bm = doc.getRange().getBookmarks().get("HD_CL_STD");

Paragraph paragraph = (Paragraph) bm.getBookmarkEnd().getParentNode();
if (paragraph.getPreviousSibling() != null &&
    paragraph.getPreviousSibling().getNodeType() == NodeType.TABLE)
{
    Table table = (Table) bm.getBookmarkEnd().getParentNode().getPreviousSibling();
    Node node = table.getLastRow().getLastCell().getLastChild();
    builder.moveTo(node);
    builder.endBookmark("HD_CL_STD");
}

paragraph.remove();

doc.save(MyDir + "17.3.0.doc");

Hi,

Thanks for your quick response.

Your suggestion works for one scenario, but other scenarios also need to be covered. Sorry, I forgot to explain the complete requirement in my first query.

Inside the [HD_CL_STD] bookmark, the document has another bookmark, [SVC_SEC]. Sometimes it will have many other bookmarks. None of these bookmarks should be deleted when the empty space(s) are removed. Also, the content before the empty space(s) will mostly be a TABLE or TEXT.

I have tried the code below to handle the TABLE and TEXT objects, but it removes the space only when there is a single empty paragraph.

public void removeSpaces(Document doc)
{
    try
    {
        DocumentBuilder builder = new DocumentBuilder(doc);
        Bookmark bm = doc.getRange().getBookmarks().get("HD_CL_STD");

        Paragraph paragraph = (Paragraph) bm.getBookmarkEnd().getParentNode();
        if (paragraph.getPreviousSibling() != null &&
            paragraph.getPreviousSibling().getNodeType() == NodeType.TABLE)
        {
            Table table = (Table) bm.getBookmarkEnd().getParentNode().getPreviousSibling();
            Node node = table.getLastRow().getLastCell().getLastChild();
            builder.moveTo(node);
            builder.endBookmark("HD_CL_STD");
        }

        if (paragraph.getPreviousSibling() != null &&
            paragraph.getPreviousSibling().getNodeType() == NodeType.PARAGRAPH)
        {
            Paragraph para = (Paragraph) bm.getBookmarkEnd().getParentNode().getNextSibling();
            Node node = para.getLastChild();
            builder.moveTo(node);
            builder.endBookmark("HD_CL_STD");
        }
        paragraph.remove();
    }
    catch (Exception e)
    {
        e.printStackTrace();
    }
}

Please refer to the attachment and help me fix the issues.

Thanks & Regards,
Ananth

Hi there,

Thanks for your inquiry.

The BookmarkEnd nodes of the bookmarks [HD_CL_STD] and [SVC_SEC] are inside the same paragraph. Please check the attached DOM image for details. Please use the same approach to insert these BookmarkEnd nodes: move the cursor to the node that precedes the paragraph containing the BookmarkEnd nodes, and insert them using the DocumentBuilder.endBookmark method.

Please let us know if you have any more queries.

Hi,

Please help me fix the issues below:

  1. Is there any way to list all the bookmarks whose BookmarkEnd node is in the same place as that of [HD_CL_STD], and then call DocumentBuilder.endBookmark for each of them? [SVC_SEC] is not fixed; sometimes the document will have other bookmarks.

  2. The method below works fine if there is a single empty line. Please note that I have used only the getNextSibling() method, because getPreviousSibling() is not working properly.

  3. If there is more than one empty line, the method below does not work. It throws a “java.lang.IllegalArgumentException: node” exception because the NODE object is null. Please provide a code snippet to handle multiple empty lines (sometimes there will be 2, 3, or 4 empty lines).

public void removeSpaces(Document doc)
{
    try
    {
        DocumentBuilder builder = new DocumentBuilder(doc);

        Bookmark bmS = doc.getRange().getBookmarks().get("SVC_SEC");
        Paragraph paragraphS = (Paragraph) bmS.getBookmarkEnd().getParentNode();
        if (paragraphS.getNextSibling() != null &&
            paragraphS.getNextSibling().getNodeType() == NodeType.PARAGRAPH)
        {
            Paragraph para = (Paragraph) bmS.getBookmarkEnd().getParentNode().getNextSibling();
            Node node = para.getLastChild();
            builder.moveTo(node);
            builder.endBookmark("SVC_SEC");
        }
        Bookmark bm = doc.getRange().getBookmarks().get("HD_CL_STD");
        Paragraph paragraph = (Paragraph) bm.getBookmarkEnd().getParentNode();
        if (paragraph.getNextSibling() != null &&
            paragraph.getNextSibling().getNodeType() == NodeType.PARAGRAPH)
        {
            Paragraph para = (Paragraph) bm.getBookmarkEnd().getParentNode().getNextSibling();
            Node node = para.getLastChild();
            builder.moveTo(node);
            builder.endBookmark("HD_CL_STD");
        }
        paragraph.remove();
    }
    catch (Exception e)
    {
        e.printStackTrace();
    }
}

Please refer to the attachments for the input details.

Thanks & Regards
Ananth

Hi Ananth,

Thanks for your inquiry. Please use the following code example to get the desired output. Hope this helps you.

This code example does the following:

  1. Get the Paragraph node that contains the BookmarkEnd node of [HD_CL_STD].
  2. Get all BookmarkEnd nodes that are in this Paragraph.
  3. Get the Paragraph node that precedes the Paragraph from step 1.
  4. Insert the BookmarkEnd nodes into the Paragraph from step 3.
  5. Collect all empty paragraphs after the BookmarkEnd of [HD_CL_STD] and remove them.

Document doc = new Document(MyDir + "Sample_Input_I_with_many_emptyLines.doc");

DocumentBuilder builder = new DocumentBuilder(doc);
Bookmark bm = doc.getRange().getBookmarks().get("HD_CL_STD");

Paragraph paragraph = (Paragraph) bm.getBookmarkEnd().getParentNode();

// Check whether every child node of the paragraph is a BookmarkEnd.
if (paragraph.getChildNodes().getCount() == paragraph.getChildNodes(NodeType.BOOKMARK_END, true).getCount())
{
    // Walk backwards to the nearest preceding paragraph; stop at the body or at the start of the document.
    Node node = paragraph.previousPreOrder(doc);
    while (node != null &&
        node.getNodeType() != NodeType.PARAGRAPH &&
        node.getNodeType() != NodeType.BODY)
    {
        node = node.previousPreOrder(doc);
    }

    // Copy every BookmarkEnd node into the previous paragraph.
    if (node != null && node.getNodeType() == NodeType.PARAGRAPH)
        for (BookmarkEnd bEnd : (Iterable<BookmarkEnd>) paragraph.getChildNodes(NodeType.BOOKMARK_END, true))
        {
            ((Paragraph) node).appendChild(new BookmarkEnd(doc, bEnd.getName()));
        }
}

// Remove empty paragraphs
ArrayList<Node> emptylines = new ArrayList<Node>();
emptylines.add(paragraph);
Node node = paragraph;
while (true)
{
    if (node.getNextSibling() != null &&
        node.getNextSibling().getNodeType() == NodeType.PARAGRAPH &&
        ((Paragraph) node.getNextSibling()).getChildNodes().getCount() == 0)
    {
        emptylines.add(node.getNextSibling());
        node = node.getNextSibling();
    }
    else
        break;
}

for (Node para : emptylines)
{
    para.remove();
}

doc.save(MyDir + "Out.doc");
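
For anyone who wants to see the logic of the five steps apart from the Aspose.Words API, here is a minimal plain-Java sketch. It is only a model: the `Para` class and the `removeBlankRun` method are hypothetical names invented for this illustration and are not part of Aspose.Words. A paragraph is reduced to its text plus the names of the bookmarks that end in it. Unlike the snippet above, the model also carries over bookmark ends found in the following blank paragraphs, which is an assumption about what is wanted.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyParagraphCleanup {

    /** Hypothetical stand-in for a paragraph: its text plus the names of bookmarks ending in it. */
    static final class Para {
        final String text;
        final List<String> bookmarkEnds = new ArrayList<>();

        Para(String text, String... ends) {
            this.text = text;
            for (String end : ends) {
                bookmarkEnds.add(end);
            }
        }

        boolean isBlank() {
            return text.trim().isEmpty();
        }
    }

    /**
     * Removes the blank paragraph at index i and every blank paragraph directly after it,
     * first moving their bookmark ends to the preceding paragraph.
     * Returns the number of paragraphs removed.
     */
    static int removeBlankRun(List<Para> body, int i) {
        if (i <= 0 || i >= body.size() || !body.get(i).isBlank()) {
            return 0; // no preceding paragraph, or the paragraph has real content
        }
        Para previous = body.get(i - 1);
        int removed = 0;
        while (i < body.size() && body.get(i).isBlank()) {
            // Keep the bookmark ends alive by attaching them to the previous paragraph.
            previous.bookmarkEnds.addAll(body.get(i).bookmarkEnds);
            body.remove(i);
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Para> body = new ArrayList<>();
        body.add(new Para("Last line of the section"));
        body.add(new Para("", "SVC_SEC", "HD_CL_STD")); // blank paragraph holding two bookmark ends
        body.add(new Para(""));
        body.add(new Para("  "));
        body.add(new Para("PRICE VALUES"));

        int removed = removeBlankRun(body, 1);
        System.out.println(removed + " removed, bookmark ends moved: " + body.get(0).bookmarkEnds);
    }
}
```

The point of the model is the order of operations: the bookmark ends are re-attached before the paragraph that held them is removed, and the loop keeps going for as long as the next paragraph is blank, so two, three, or four empty lines are handled the same way as one.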

Hi Tahir,

Thanks a lot. The suggested fixes are working fine. So far we have not faced any issues related to this fix; if anything comes up, I will reach out to you again.

Thanks for your support.

Regards,
Ananth

Hi Ananth,

Thanks for your feedback. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.

Hey there people, thanks a lot for all your answers. I already got lots of headaches because of this (no really, I even took an Analgin), but it seems I found some useful information here… thanks again!!