Delete the blank pages at the end of a document

Hi, I am working on a function to delete blank page(s) at the end of the document using C#.

I found a couple of support articles that seem relevant: https://forum.aspose.com/t/blank-page-at-the-end-of-the-finished-document/103623/17?u=dougt and https://forum.aspose.com/t/how-to-remove-blank-pages-in-word-files-before-appending-them/48526/2. Based on them, I came up with the following code:

while (!_document.LastSection.Body.LastParagraph.HasChildNodes)
{
    // According to Aspose support: "Please note that a document should always end in a paragraph, so in
    // some instances you may find that during save an empty paragraph is added if for example the previous
    // node is a table."
    // So if the previous node is not a paragraph (a table, for example), we don't want to remove the
    // last empty paragraph.
    if (_document.LastSection.Body.LastParagraph.PreviousSibling != null &&
        _document.LastSection.Body.LastParagraph.PreviousSibling.NodeType != NodeType.Paragraph)
    {
        break;
    }
    else
    {
        _document.LastSection.Body.LastParagraph.Remove();
        emptyTrailingPara++;
    }

    // If the current section becomes empty, we should remove it.
    if (!_document.LastSection.Body.HasChildNodes)
    {
        _document.LastSection.Remove();
        emptyTrailingSections++;
    }

    // Exit the loop if the document becomes empty.
    if (!_document.HasChildNodes)
        break;
}

However, this code, which is based on the example provided, did not remove the trailing blank page after a table. I had to remove the check that breaks out of the loop when PreviousSibling.NodeType != NodeType.Paragraph, so it became:

while (!_document.LastSection.Body.LastParagraph.HasChildNodes)
{
    // According to Aspose support: "Please note that a document should always end in a paragraph, so in
    // some instances you may find that during save an empty paragraph is added if for example the previous
    // node is a table."
    // The check for a non-paragraph previous node (a table, for example) has been removed here,
    // so the last empty paragraph is always removed.
    _document.LastSection.Body.LastParagraph.Remove();
    emptyTrailingPara++;

    // If the current section becomes empty, we should remove it.
    if (!_document.LastSection.Body.HasChildNodes)
    {
        _document.LastSection.Remove();
        emptyTrailingSections++;
    }

    // Exit the loop if the document becomes empty.
    if (!_document.HasChildNodes)
        break;
}

So in which cases does PreviousSibling.NodeType need to be checked? Am I right to remove the check in my situation, or could you provide better sample code that handles all situations?

Thanks

@DougT,

To ensure a timely and accurate response, please ZIP and attach the following resources here for testing:

  • Your input Word document
  • Aspose.Words generated output document showing the undesired behavior
  • Your expected document which shows the correct output. Please create this document by using Microsoft Word application.
  • Please create a standalone console application (source code without compilation errors) that helps us to reproduce your current problem on our end and attach it here for testing.

As soon as you have these pieces of information ready, we will start investigating your issue and provide you with code to achieve the same by using Aspose.Words. Thanks for your cooperation.

Please see the example code I provided above; the blank page after the table is not removed.

@DougT,

I am afraid we do not see any attachments in this thread. Please ZIP and upload the required resources here for testing. You may also upload the ZIP file to Dropbox and share the download link here. Thanks for your cooperation.

AsposeRemoveTrailingBlankPage.zip (72.2 KB)

Hi Awais, the Aspose forum seems to have a limitation or bug: my original attachment included the NuGet package for Aspose.Words and was 81 MB. The upload finished without showing any error, but the attachment did not appear. I have now uploaded only the C# project folder rather than the whole solution folder. Let me know whether it works.

Thanks

@DougT,

I am afraid you cannot remove the last empty Paragraph from the document. However, you can work around this issue by using the following code:

Document doc = new Document("D:\\Temp\\DocTabThenBlankPage.docx");

Paragraph para = doc.LastSection.Body.LastParagraph;

if (string.IsNullOrEmpty(para.ToString(SaveFormat.Text).Trim()))
{
    para.ParagraphBreakFont.Size = 1;
    para.ParagraphFormat.SpaceAfter = 0;
    para.ParagraphFormat.SpaceBefore = 0;
}

doc.Save("D:\\Temp\\18.7.docx");

Hi Awais, thanks for the help. Your workaround does seem to achieve what we want: the empty paragraph no longer creates an empty page at the end of the document.

Out of curiosity, is this behaviour (“cannot remove the last empty Paragraph from document”) part of how MS Word works, or a limitation of the Aspose.Words API?

Most importantly, can you please explain this code, which I got from the Aspose support forum:

if (_document.LastSection.Body.LastParagraph.PreviousSibling != null &&
    _document.LastSection.Body.LastParagraph.PreviousSibling.NodeType != NodeType.Paragraph)

It seems to leave an empty paragraph immediately following a table in place, which I think should be removed.

Thanks

@DougT,

I am afraid we are unable to remove the last empty Paragraph even by using MS Word 2016. So this is not a limitation of Aspose.Words but expected behavior.

Secondly, in the code, the ‘if’ condition will be true when, in the node hierarchy, the previous sibling Node of the last Paragraph is not a Paragraph (a Table, for example).

One quick way to see the node hierarchy is to use the DocumentExplorer example project.
Document Tree Navigation
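
To make the two points above concrete, here is a minimal sketch that combines them. It is an illustration under stated assumptions, not an official Aspose sample: the class and method names (TrailingBlankCleanup, TrimTrailingEmptyParagraphs) are hypothetical, "empty" is taken to mean a paragraph with no child nodes, and only the last section is examined. Trailing empty paragraphs are removed only while the node before them is also a paragraph. An empty paragraph that directly follows a non-paragraph node such as a table is kept and shrunk instead, as in the workaround posted earlier.

```csharp
using Aspose.Words;

static class TrailingBlankCleanup
{
    // Hypothetical helper: returns the number of empty trailing paragraphs removed.
    public static int TrimTrailingEmptyParagraphs(Document doc)
    {
        Body body = doc.LastSection?.Body;
        if (body == null)
            return 0;

        int removed = 0;

        // Remove an empty last paragraph only when the node before it is also a
        // paragraph, so the body still ends in a paragraph afterwards.
        while (body.LastChild is Paragraph last &&
               !last.HasChildNodes &&
               last.PreviousSibling is Paragraph)
        {
            last.Remove();
            removed++;
        }

        // If the remaining final paragraph is still empty (for example, it follows
        // a table), it cannot be removed, so make it as small as possible.
        if (body.LastChild is Paragraph finalPara && !finalPara.HasChildNodes)
        {
            finalPara.ParagraphBreakFont.Size = 1;
            finalPara.ParagraphFormat.SpaceBefore = 0;
            finalPara.ParagraphFormat.SpaceAfter = 0;
        }

        return removed;
    }
}
```

Whether shrinking the final paragraph is enough to avoid a trailing blank page depends on the layout of the particular document, so the result is worth checking against the expected output created in MS Word.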

Hi, thanks Awais for your help.

Yes, I am aware of DocumentExplorer and have found it invaluable.

One more question, if you don’t mind:

About the code:

if (string.IsNullOrEmpty(para.ToString(SaveFormat.Text).Trim()))
{
    para.ParagraphBreakFont.Size = 1;
    para.ParagraphFormat.SpaceAfter = 0;
    para.ParagraphFormat.SpaceBefore = 0;
}

Why change the ParagraphBreakFont size rather than the font size of the empty paragraph itself? Is there any documentation or help regarding ParagraphBreak? I don’t think I’ve come across it before, and I can’t seem to view it in either Word or DocumentExplorer.

Also, SpaceAfter and SpaceBefore correspond to the “Spacing Before/After” settings in Word’s “Layout” tab, don’t they? The names are a little misleading to me: “space” could mean a space character before or after some words, whereas “spacing” in MS Word terminology means the space above or below a paragraph.

Again, thanks very much for your help.

@DougT,

Please refer to this screenshot.

A Paragraph Break represents the end-of-paragraph character: “\x000d” or “\r”, the same as Cr. As shown in the screenshot, in MS Word 2016 you can use the “Show/Hide paragraph marks and other hidden formatting symbols” command to view the different control characters.

The SpaceAfter and SpaceBefore properties can be used to adjust the vertical spacing between Paragraphs. Please refer to the screenshot to know the equivalent of these in MS Word.
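
As a small illustration of why the workaround sets ParagraphBreakFont rather than a run font, the sketch below inspects the empty paragraph of a newly created blank document. It assumes that such a paragraph contains no Run nodes, so the paragraph mark is the only thing whose font can be changed. The class name ParagraphBreakDemo is hypothetical, and the values printed may differ when Aspose.Words runs without a license.

```csharp
using System;
using Aspose.Words;

static class ParagraphBreakDemo
{
    public static void Main()
    {
        // A blank document starts with one section containing one paragraph.
        Document doc = new Document();
        Paragraph para = doc.FirstSection.Body.FirstParagraph;

        // Number of text runs in the paragraph; with no runs there is no Run.Font to set.
        Console.WriteLine("Runs in paragraph: " + para.Runs.Count);

        // The paragraph break is the carriage return control character.
        Console.WriteLine("ParagraphBreak is \\r: " + (ControlChar.ParagraphBreak == "\r"));

        // Formatting of the paragraph mark itself is exposed through ParagraphBreakFont.
        para.ParagraphBreakFont.Size = 1;

        // SpaceBefore and SpaceAfter are vertical distances in points, not space characters.
        para.ParagraphFormat.SpaceBefore = 0;
        para.ParagraphFormat.SpaceAfter = 0;

        Console.WriteLine("Paragraph mark font size: " + para.ParagraphBreakFont.Size);
    }
}
```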

Thank you, Awais, for the explanation. The picture is indeed worth a thousand words.

No more questions from me, I think.

Thanks again.

@DougT,

Thanks for your feedback. In case you have any further inquiries or need any help, please let us know.