Removing blank pages from a document

Hi,
In our application we perform many operations on Word documents. After all the operations, blank pages appear in the output documents.
Can you help us remove blank pages that contain only header and footer information from a document?
For example, a document has 3 sections, and each section contains a blank page with header and footer information. We want to remove only that blank page from every section of the document.
Kindly reply ASAP.
Thank you…

Hi there,

Thanks for your inquiry. Please use the following code example to achieve your requirements. I also suggest reading the following documentation pages for reference:
https://reference.aspose.com/words/net/aspose.words.layout/layoutcollector/
https://reference.aspose.com/words/net/aspose.words.layout/layoutenumerator/

Hope this helps you. Please let us know if you have any more queries.

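// Requires: using System.Collections; using Aspose.Words; using Aspose.Words.Layout;
Document doc = new Document(MyDir + "in.docx");
String PageText = "";
LayoutCollector lc = new LayoutCollector(doc);
// The page on which the last paragraph starts is used as the number of pages to check
int pages = lc.GetStartPageIndex(doc.LastSection.Body.LastParagraph);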
for (int i = 1; i <= pages; i++)
{
    ArrayList nodes = GetNodesByPage(i, doc);
    foreach (Paragraph para in nodes)
    {
        PageText += para.ToString(SaveFormat.Text).Trim();
    }
    // Empty Page
    if (PageText == "")
    {
        foreach (Node node in nodes)
        {
            node.Remove();
        }
    }
    nodes.Clear();
    PageText = "";
}
doc.Save(MyDir + "Out.docx");
// Get Paragraph nodes by page number
private ArrayList GetNodesByPage(int page, Document document)
{
    ArrayList nodes = new ArrayList();
    LayoutCollector lc = new LayoutCollector(document);
    foreach (Paragraph para in document.GetChildNodes(NodeType.Paragraph, true))
    {
        if (lc.GetStartPageIndex(para) == page)
            nodes.Add(para);
    }
    return nodes;
}

Hi,
We are using version 10.8.0.0, and the LayoutCollector class is not available in this version.
Please let me know how I can remove blank pages from a document using version 10.8.0.0 of Aspose.Words.
Thank You…

Hi there,

Thanks for your inquiry. Please upgrade to the latest version, Aspose.Words for .NET 14.1.0, to use the LayoutCollector class.

You can achieve your requirement simply by removing the empty paragraphs from your document. Please see the following code:

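Document doc = new Document(@"C:\Temp\input.docx");
foreach (Section sec in doc.Sections)
{
    // Take a snapshot with ToArray() so that removing paragraphs does not disturb the iteration
    Node[] bodyParas = sec.Body.GetChildNodes(NodeType.Paragraph, true).ToArray();
    foreach (Paragraph para in bodyParas)
        // A paragraph without child nodes has no content
        if (!para.HasChildNodes)
            para.Remove();
}
doc.Save(@"C:\Temp\out.docx");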

I hope this helps.

Hi,
We are now using Aspose.Words for .NET 14.1.0, and I tried the code you suggested, which uses the LayoutCollector class:

Document doc = new Document(MyDir + "in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
String PageText = "";
LayoutCollector lc = new LayoutCollector(doc);
int pages = lc.GetStartPageIndex(doc.LastSection.Body.LastParagraph);
for (int i = 1; i <= pages; i++)
{
    ArrayList nodes = GetNodesByPage(i, doc);
    foreach (Paragraph para in nodes)
    {
        PageText += para.ToString(SaveFormat.Text).Trim();
    }
    // Empty Page
    if (PageText == "")
    {
        foreach (Node node in nodes)
        {
            node.Remove();
        }
    }
    nodes.Clear();
    PageText = "";
}
doc.Save(MyDir + "Out.docx");
// Get Paragraph nodes by page number
private ArrayList GetNodesByPage(int page, Document document)
{
    ArrayList nodes = new ArrayList();
    LayoutCollector lc = new LayoutCollector(document);
    foreach (Paragraph para in document.GetChildNodes(NodeType.Paragraph, true))
    {
        if (lc.GetStartPageIndex(para) == page)
            nodes.Add(para);
    }
    return nodes;
}

But this is not working, because for an empty page the number of nodes is zero.
Please let me know how I can delete an empty page with zero nodes.
Also, even when the number of nodes on an empty page is not zero, the above code does not remove the blank page.
Please let me know what modifications are needed to make the above code work.
For your reference I am attaching a source document. In it, the second page of section 1 is empty, and this page has to be removed in the destination document. Kindly let me know how I can delete an empty page in both cases (when the number of nodes on the empty page is 0 and when it is not equal to 0).

Hi there,

Thanks for your inquiry. In your case, the document has two sections, and the second section has only one empty paragraph. In this case, please remove any section that contains no data from the document before checking the pages. Please check the following code snippet. I have attached the output document to this post for your reference.

Please let us know if you have any more queries.

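Document doc = new Document(MyDir + "Source.doc");
// Iterate over a snapshot (ToArray) so that removing a section does not disturb the loop
foreach (Section section in doc.Sections.ToArray())
{
    // Remove sections whose text is empty after trimming
    if (section.ToString(SaveFormat.Text).Trim() == String.Empty)
        section.Remove();
}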
DocumentBuilder builder = new DocumentBuilder(doc);
String PageText = "";
LayoutCollector lc = new LayoutCollector(doc);
int pages = lc.GetStartPageIndex(doc.LastSection.Body.LastParagraph);
for (int i = 1; i <= pages; i++)
{
    ArrayList nodes = GetNodesByPage(i, doc);
    foreach (Paragraph para in nodes)
    {
        PageText += para.ToString(SaveFormat.Text).Trim();
    }
    // Empty Page
    if (PageText == "")
    {
        foreach (Node node in nodes)
        {
            node.Remove();
        }
    }
    nodes.Clear();
    PageText = "";
}
doc.Save(MyDir + "Out.docx");

Hi,
Thanks for your reply.
I am able to remove blank pages that are in a separate section, but not blank pages that are in a section which also contains other text.
For instance, the attached document has two sections. In the first section, the 1st page has some content and the 2nd page is empty. I want to remove the empty page in the 1st section. I tried the code you suggested, but it is not working.
Please find attached the source document. Kindly tell me how I can remove blank pages from a section that also contains pages with content.

Hi there,

Thanks for your inquiry. I have modified the GetNodesByPage method: the condition now also collects paragraphs for which para.IsEndOfSection is true. Please check the code below. I hope this helps. Please let us know if you still face any issue.

// Get Paragraph nodes by page number
private ArrayList GetNodesByPage(int page, Document document)
{
    ArrayList nodes = new ArrayList();
    LayoutCollector lc = new LayoutCollector(document);
    foreach (Paragraph para in document.GetChildNodes(NodeType.Paragraph, true))
    {
        if (lc.GetStartPageIndex(para) == page || para.IsEndOfSection)
            nodes.Add(para);
    }
    return nodes;
}