Removing empty page in the doc with multiple Sections with pageBreaks

Hi Aspose team

We are trying to remove blank/empty pages from the middle and end of a document that has multiple sections. We hope this is achievable with the LayoutCollector class (with the latest Aspose version).
The page breaks must stay intact (for example, where the text ends in the middle of a page); only the blank pages should be removed.

I am not able to remove the page. Please help.

I have attached the input and output documents along with the code file.

Thanks
goutham

Hi Goutham,

Thanks for your inquiry. We are looking into it and will update you shortly.

Best Regards,

Hi, any updates?

Hi Goutham,

Please accept my apologies for the late response.

To remove the empty pages from the end of the document, please use the following code example.

// Remove empty pages from the end of document
while (doc.getLastSection().getBody().getLastParagraph().toString(SaveFormat.TEXT).trim().equals(""))
{
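    // Stop when the node before the last paragraph is not a paragraph (for example, a table).
    if (doc.getLastSection().getBody().getLastParagraph().getPreviousSibling() != null &&
            doc.getLastSection().getBody().getLastParagraph().getPreviousSibling().getNodeType() != NodeType.PARAGRAPH)
        break;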
    doc.getLastSection().getBody().getLastParagraph().remove();

    // If the current section becomes empty, we should remove it.
    if (!doc.getLastSection().getBody().hasChildNodes())
        doc.getLastSection().remove();

    // We should exit the loop if the document becomes empty.
    if (!doc.hasChildNodes())
        break;
}

doc.save(MyDir + "Output.doc");
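
The loop above trims blank paragraphs from the end of the document and stops at the first one that has text. As a rough illustration of that idea only, here is a minimal plain-Java sketch that works on a list of strings instead of an Aspose document. The class name TrailingBlankTrim and the method trimTrailingBlank are hypothetical and are not part of Aspose.Words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: each string plays the role of one paragraph's text.
class TrailingBlankTrim {
    /** Removes blank entries from the end of the list and stops at the first non-blank one. */
    static List<String> trimTrailingBlank(List<String> paragraphs) {
        List<String> result = new ArrayList<>(paragraphs);
        // trim() also strips the form feed ("\f"), so a lone page break counts as blank.
        while (!result.isEmpty() && result.get(result.size() - 1).trim().isEmpty()) {
            result.remove(result.size() - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> paragraphs = Arrays.asList("Intro", "", "Body", " ", "\f", "");
        System.out.println(trimTrailingBlank(paragraphs));
    }
}
```

Only the trailing blank entries are dropped; the empty entry between "Intro" and "Body" is kept, which is why this approach cannot remove a blank page in the middle of a document.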

The code example shared in your first post does not remove empty pages that contain an explicit page break. For example, there is a section break and a page break on the second page of the input document. If you want to remove that page, you need to remove the page breaks from the document. In that case, please replace the following code snippet

if(para.getText().contains(ControlChar.PAGE_BREAK))
{
    PageText = "Page Break";
    break;
}

with

if(para.getText().contains(ControlChar.PAGE_BREAK))
    para.getRange().replace(ControlChar.PAGE_BREAK, "", new FindReplaceOptions());

If you still face a problem, please share your expected output document here for our reference. We will then provide more information about your query along with code.

Hi Tahir,

We cannot remove those page breaks. We have a requirement to start some sections on a new page, so we have added page breaks for them. (For example, if section 2 ends in the middle of a page, the next section should start on a new page, so we have a page break for that.)

So, is there any way with Aspose to remove only the page break that occurs at the start of a blank page? (That would ultimately remove the blank page in the middle of the document.)

Attaching the expected output doc for ref – “ExpectedOutput.doc”

thanks
goutham

Hi Goutham,

Thanks for your inquiry. Please use the following modified code example to get the desired output. Hope this helps you.

Document doc = new Document(MyDir + "Input.doc");
doc.setTrackRevisions(false);
doc.updatePageLayout();
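// Iterate over a snapshot so that removing a section does not disturb the loop.
for (Node secNode : doc.getSections().toArray())
{
    Section sec = (Section) secNode;
    if (sec.toString(SaveFormat.TEXT).trim().isEmpty())
    {
        sec.remove();
    }
}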

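boolean PageBreak = false;
String PageText = "";
LayoutCollector lc = new LayoutCollector(doc);
int pages = doc.getPageCount();
ArrayList<Node> removenodes = new ArrayList<Node>();
ArrayList<Node> pagebreaknodes = new ArrayList<Node>();
for (int i = 1; i <= pages; i++)
{
    PageBreak = false;
    PageText = "";
    // GetNodesByPage is the helper from the code file attached to the first post (not shown here);
    // it is assumed to return the paragraphs that lie on page i.
    ArrayList<Paragraph> nodes = GetNodesByPage(i, doc);
    for (Paragraph para : nodes)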
    {
        if (para.getText().contains(ControlChar.PAGE_BREAK))
        {
            PageBreak = true;
        }
        PageText += para.toString(SaveFormat.TEXT).trim();
    }

    // If page's text is empty and there is only page break
    // then remove the page break
    if (PageText.length() == 0 && PageBreak == true)
    {
        for (Node node : nodes)
        {
            if (node.getText().contains(ControlChar.PAGE_BREAK))
            {
                pagebreaknodes.add(node);
            }
        }
        PageBreak = false;
    }

    if (PageText.equals("")) //Empty Page
    {
        for (Node node : nodes)
        {
            removenodes.add(node);
        }
    }
    nodes.clear();
}

//Remove nodes from empty pages
for (Node node : removenodes)
{
    node.remove();
}

//Remove page break
for (Node node : pagebreaknodes)
{
    node.getRange().replace(ControlChar.PAGE_BREAK, "", new FindReplaceOptions());
}

//Remove empty pages from the end of document
while (doc.getLastSection().getBody().getLastParagraph().toString(SaveFormat.TEXT).trim().equals(""))
{
    // Stop when the node before the last paragraph is not a paragraph (for example, a table).
    if (doc.getLastSection().getBody().getLastParagraph().getPreviousSibling() != null &&
            doc.getLastSection().getBody().getLastParagraph().getPreviousSibling().getNodeType() != NodeType.PARAGRAPH)
        break;
    doc.getLastSection().getBody().getLastParagraph().remove();

    // If the current section becomes empty, we should remove it.
    if (!doc.getLastSection().getBody().hasChildNodes())
        doc.getLastSection().remove();

    // We should exit the loop if the document becomes empty.
    if (!doc.hasChildNodes())
        break;
}

doc.save(MyDir + "Output.doc");
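
The per-page rule used above is: a page is treated as blank when the trimmed text of all its paragraphs is empty, so a page that holds only a page break is removed, while a page break that ends a page with text is left alone. The following plain-Java sketch illustrates just that rule on lists of strings. The class BlankPageRule, the enum Action, and the method classify are hypothetical names for illustration and are not part of Aspose.Words; the only fact borrowed from the library is that ControlChar.PAGE_BREAK is the form feed character.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: each inner list holds the paragraph texts found on one page.
class BlankPageRule {
    // Form feed character; Aspose.Words exposes the same value as ControlChar.PAGE_BREAK.
    static final String PAGE_BREAK = "\f";

    enum Action { KEEP, REMOVE_PAGE }

    /** Decides, for each page, whether its paragraphs carry any visible text. */
    static List<Action> classify(List<List<String>> pages) {
        List<Action> actions = new ArrayList<>();
        for (List<String> paragraphs : pages) {
            StringBuilder pageText = new StringBuilder();
            for (String paragraph : paragraphs) {
                // trim() strips the form feed as well, since it is below U+0020.
                pageText.append(paragraph.trim());
            }
            // A page with text keeps everything, including a break that ends it.
            actions.add(pageText.length() == 0 ? Action.REMOVE_PAGE : Action.KEEP);
        }
        return actions;
    }

    public static void main(String[] args) {
        List<List<String>> pages = Arrays.asList(
            Arrays.asList("Section 1 text", "ends mid-page" + PAGE_BREAK),
            Arrays.asList(PAGE_BREAK),
            Arrays.asList("Section 2 text"));
        System.out.println(classify(pages)); // prints [KEEP, REMOVE_PAGE, KEEP]
    }
}
```

In this model the first page keeps its page break because it also has text, which matches the requirement that a section ending mid-page still pushes the next section to a new page.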

Thanks Tahir,
I am now able to remove the blank page.

Hi Goutham,

Thanks for your feedback. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.