Unable to delete blank pages

I am currently working with Aspose and facing an issue while trying to delete an empty last page from a document containing tables. As per my analysis, there are 5 pages in total, with the last page appearing to be empty when viewed in MS Word. However, upon debugging the code in Aspose, I noticed that the last page still contains data from the 4th page.

Could you please assist me in resolving this issue? Your expertise would be greatly appreciated, as I’m looking to successfully remove the last page without any residual content.

My code is below:

try
{
    Aspose.Words.Document docBlankPage = new Aspose.Words.Document(sPath);
    ArrayList empty_Page_Numbers = new ArrayList();
    int total_Pages = docBlankPage.PageCount;
    for (int i = 0; i < total_Pages; i++)
    {
        Aspose.Words.Document one_Page_Doc = docBlankPage.ExtractPages(i, 1);

        int shape_Count = 0;
        int bookmarkCount = 0;
        string text_Of_Page = "";

        foreach (Shape shape in docBlankPage.GetChildNodes(Aspose.Words.NodeType.Shape, true))
        {
            if (shape.HasImage)
            {
                shape_Count++;
                break;
            }
        }

        foreach (Aspose.Words.Section section in one_Page_Doc.Sections)
        {
            text_Of_Page = text_Of_Page + section.Body.ToString(Aspose.Words.SaveFormat.Text);
            if (!string.IsNullOrEmpty(text_Of_Page.Trim()) || shape_Count > 0)
                break;
        }
        if (string.IsNullOrEmpty(text_Of_Page.Trim()) && shape_Count == 0 && bookmarkCount == 0)
        {
            empty_Page_Numbers.Add(i + 1);
        }
    }
    Aspose.Words.Layout.LayoutCollector layoutCollector = new Aspose.Words.Layout.LayoutCollector(docBlankPage);
    ArrayList list = new ArrayList();
    foreach (Aspose.Words.Node node in docBlankPage.GetChildNodes(Aspose.Words.NodeType.Any, true))
    {
        if (layoutCollector.GetNumPagesSpanned(node) == 0)
        {
            int pageIndex = layoutCollector.GetStartPageIndex(node);
            if (empty_Page_Numbers.Contains(pageIndex))
            {
                node.Remove();
            }
        }
    }

    docBlankPage.Save(sPath);
}
catch (Exception) { }
#endregion Remove Blank Page

@Jayarajmca

To successfully delete a blank last page from your document using Aspose.Words, you need to ensure that the page is truly empty and does not contain any residual content from the previous page. Here are some steps and code adjustments you can consider:

  1. Check for Content: Ensure that you are accurately checking for any content on the last page. Sometimes, even if a page appears blank in MS Word, it may still contain hidden elements or formatting that prevent it from being considered empty.

  2. Use LayoutCollector: The LayoutCollector class can help you determine the layout of the document and identify if a page is blank. You can use it to check if any nodes span across the last page.

  3. Modify Your Code: Here’s a refined version of your code that focuses on checking the last page specifically and removing it if it is indeed empty:

try
{
    Aspose.Words.Document doc = new Aspose.Words.Document(sPath);
    Aspose.Words.Layout.LayoutCollector layoutCollector = new Aspose.Words.Layout.LayoutCollector(doc);

    // LayoutCollector reports 1-based page indices, so the last page is PageCount
    int lastPageIndex = doc.PageCount;
    bool isLastPageBlank = true;

    // Look for any node that starts on the last page and carries visible text
    foreach (Aspose.Words.Node node in doc.GetChildNodes(Aspose.Words.NodeType.Any, true))
    {
        int pageIndex = layoutCollector.GetStartPageIndex(node);
        if (pageIndex == lastPageIndex && !string.IsNullOrWhiteSpace(node.GetText()))
        {
            isLastPageBlank = false;
            break;
        }
    }

    // If the last page is blank, keep only the pages before it.
    // ExtractPages takes a zero-based start index and returns a new document.
    if (isLastPageBlank && lastPageIndex > 1)
    {
        doc = doc.ExtractPages(0, lastPageIndex - 1);
    }

    doc.Save(sPath);
}
catch (Exception ex)
{
    // Handle exceptions
}

Key Points:

  • The code checks each node in the document to see if it belongs to the last page and contains any text.
  • If the last page is confirmed to be blank, it is removed using the ExtractPages method.
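
If it is unclear why the last page is not detected as blank, it may help to list what the layout engine places there. The following is only a diagnostic sketch: the path "input.docx" and the class name are placeholders, and it assumes that paragraphs are the right granularity to inspect.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Layout;

class LastPageInspector
{
    static void Main()
    {
        // "input.docx" is a placeholder path for this sketch.
        Document doc = new Document("input.docx");
        LayoutCollector collector = new LayoutCollector(doc);

        // LayoutCollector page indices are 1-based, so the last page is PageCount.
        int lastPage = doc.PageCount;

        foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true))
        {
            int startPage = collector.GetStartPageIndex(paragraph);
            int endPage = collector.GetEndPageIndex(paragraph);

            // Only report paragraphs that end on the last page.
            if (endPage != lastPage)
                continue;

            string text = paragraph.GetText().Trim();
            Console.WriteLine("pages {0}-{1}, parent {2}, text length {3}",
                startPage, endPage, paragraph.ParentNode.NodeType, text.Length);
        }
    }
}
```

A paragraph reported with a start page of 4 and an end page of 5 would explain why the last page seems to carry content from the 4th page.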

Additional Considerations:

  • Ensure that there are no explicit page breaks or section breaks that might be causing the last page to appear blank.
  • If you are still facing issues, consider reviewing the document structure in detail to identify any hidden elements.

This approach should help you effectively remove the last page if it is indeed empty. If you continue to encounter problems, please provide more details about the document structure or any specific errors you are facing.
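
One common cause of a trailing blank page in documents with tables is an empty paragraph (or a paragraph holding only a page break) at the very end of the body. The sketch below removes such paragraphs, assuming that is what your document contains. It is not a general solution: the file paths are placeholders, and it does not check for bookmarks or other invisible nodes inside the paragraphs it removes. An empty paragraph that directly follows a table is kept and shrunk instead, because Word expects a paragraph after a table.

```csharp
using Aspose.Words;

class TrailingParagraphCleanup
{
    static void Main()
    {
        // "input.docx" and "output.docx" are placeholder paths for this sketch.
        Document doc = new Document("input.docx");
        Body body = doc.LastSection.Body;

        // Walk backwards over trailing paragraphs that have no visible content.
        while (body.LastChild is Paragraph last)
        {
            // Trim() also strips the paragraph mark and page-break characters.
            bool isEmpty = last.GetText().Trim().Length == 0
                && last.GetChildNodes(NodeType.Shape, true).Count == 0;
            if (!isEmpty)
                break;

            Node previous = last.PreviousSibling;
            if (previous == null || previous.NodeType == NodeType.Table)
            {
                // Keep this paragraph, but make it as small as possible.
                last.RemoveAllChildren();
                last.ParagraphFormat.PageBreakBefore = false;
                last.ParagraphFormat.SpaceBefore = 0;
                last.ParagraphFormat.SpaceAfter = 0;
                last.ParagraphBreakFont.Size = 1;
                break;
            }

            last.Remove();
        }

        doc.Save("output.docx");
    }
}
```

Saving to a separate output file makes it easy to compare the page count before and after the cleanup.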

@Jayarajmca Could you please attach your input and problematic output documents here for testing? We will check the issue and provide you more information.
As you may know, MS Word documents are flow documents by nature and have no built-in “page” concept. Consumer applications reflow the document’s content into pages on the fly, so the layout can differ between machines, for example when some fonts are not available.
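
To see whether missing fonts affect the page count on your machine, you could log font substitution warnings while the layout is built. This is a minimal sketch, not an official recommendation: the class names and the "input.docx" path are placeholders.

```csharp
using System;
using Aspose.Words;

// Prints a line for every font substitution reported during layout.
class FontWarningLogger : IWarningCallback
{
    public void Warning(WarningInfo info)
    {
        if (info.WarningType == WarningType.FontSubstitution)
            Console.WriteLine("Font substitution: " + info.Description);
    }
}

class Program
{
    static void Main()
    {
        // "input.docx" is a placeholder path for this sketch.
        Document doc = new Document("input.docx");
        doc.WarningCallback = new FontWarningLogger();

        // Building the layout triggers font resolution and any warnings.
        doc.UpdatePageLayout();
        Console.WriteLine("Pages: " + doc.PageCount);
    }
}
```

If substitutions are reported, the page count seen by Aspose.Words may differ from what MS Word shows on a machine that has the original fonts installed.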