How to remove blank space and blank lines from the beginning and end of a document

TestDocNew.zip (11.8 KB)

Please find the attached Word document in .dotx format.

We have a requirement in our .NET application to merge multiple documents (child documents, in .dotx format) into a single document (the parent document, also in .dotx format).

If a child document contains extra blank space at the beginning or end, we need to remove that space before merging the child document into the parent document. I have tried to remove the extra space using suggestions already available on the forum, but they do not remove the blank space from the top and bottom of the document.

I have used the threads below:
How to remove blank lines from word document
Remove blank lines at end of word document after main content - #2 by tahir.manzoor

I have also tried the find and replace function, but the blank space was not removed.

I have attached a sample document (a child document) for reference. If you select the text, you can see the blank spaces in the document.

Thanks.

We are using Aspose.Words DLL version 19.7.0.0.

@amanjainmccalla

Please use the following code example to remove the empty space at the start and end of the document. Hope this helps you.

Document document = new Document(MyDir + @"TestDocNew.dotx");

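// Remove leading paragraphs that contain no text and no shapes.
// The null check stops the loop if the body runs out of paragraphs.
while (document.FirstSection.Body.FirstParagraph != null
    && document.FirstSection.Body.FirstParagraph.ToString(SaveFormat.Text).Trim().Equals("")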
    && document.FirstSection.Body.FirstParagraph.GetChildNodes(NodeType.Shape, true).Count == 0
    && document.FirstSection.Body.FirstParagraph.GetChildNodes(NodeType.GroupShape, true).Count == 0)
{
    document.FirstSection.Body.FirstParagraph.Remove();
}

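// Remove trailing paragraphs that contain no text and no shapes.
// The null check stops the loop if the last body has no paragraphs left.
while (document.LastSection.Body.LastParagraph != null
    && document.LastSection.Body.LastParagraph.ToString(SaveFormat.Text).Trim().Equals("")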
    && document.LastSection.Body.LastParagraph.GetChildNodes(NodeType.Shape, true).Count == 0
    && document.LastSection.Body.LastParagraph.GetChildNodes(NodeType.GroupShape, true).Count == 0)
{
    if (document.LastSection.Body.LastParagraph.PreviousSibling != null &&
        document.LastSection.Body.LastParagraph.PreviousSibling.NodeType != NodeType.Paragraph)
        break;
    document.LastSection.Body.LastParagraph.Remove();

    // If the current section becomes empty, we should remove it.
    if (!document.LastSection.Body.HasChildNodes)
        document.LastSection.Remove();

    // We should exit the loop if the document becomes empty.
    if (!document.HasChildNodes)
        break;
}

document.Save(MyDir + "output.docx");

fc_caption_new.zip (15.8 KB)

Thanks for the reply Tahir,

It's almost working fine for removing blank space at the beginning of the document, but removing space at the end of the document is causing trouble because nodes other than paragraph nodes can appear at the end of the document.

The value of document.LastSection.Body.LastParagraph.PreviousSibling.NodeType can vary from document to document. It is not always a paragraph; sometimes it is a bookmark, a table, etc.

Please process the attached document with the code you provided.

@amanjainmccalla

Your document contains tabs in the last paragraph. You can remove these tabs using the following code example. Hope this helps you.

Document document = new Document(MyDir + @"TestDocNew.dotx");

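// Remove leading paragraphs that contain no text and no shapes.
// The null check stops the loop if the body runs out of paragraphs.
while (document.FirstSection.Body.FirstParagraph != null
    && document.FirstSection.Body.FirstParagraph.ToString(SaveFormat.Text).Trim().Equals("")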
    && document.FirstSection.Body.FirstParagraph.GetChildNodes(NodeType.Shape, true).Count == 0
    && document.FirstSection.Body.FirstParagraph.GetChildNodes(NodeType.GroupShape, true).Count == 0)
{
    document.FirstSection.Body.FirstParagraph.Remove();
}

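// Remove trailing paragraphs that contain no text and no shapes.
// The null check stops the loop if the last body has no paragraphs left.
while (document.LastSection.Body.LastParagraph != null
    && document.LastSection.Body.LastParagraph.ToString(SaveFormat.Text).Trim().Equals("")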
    && document.LastSection.Body.LastParagraph.GetChildNodes(NodeType.Shape, true).Count == 0
    && document.LastSection.Body.LastParagraph.GetChildNodes(NodeType.GroupShape, true).Count == 0)
{
    if (document.LastSection.Body.LastParagraph.PreviousSibling != null &&
        document.LastSection.Body.LastParagraph.PreviousSibling.NodeType != NodeType.Paragraph)
        break;
    document.LastSection.Body.LastParagraph.Remove();

    // If the current section becomes empty, we should remove it.
    if (!document.LastSection.Body.HasChildNodes)
        document.LastSection.Remove();

    // We should exit the loop if the document becomes empty.
    if (!document.HasChildNodes)
        break;
}

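// Remove trailing runs that contain only whitespace (such as tabs) from the last paragraph.
// The Count check prevents an out-of-range index once every run has been removed.
Paragraph paragraph = document.LastSection.Body.LastParagraph;
while (paragraph != null && paragraph.Runs.Count > 0
    && paragraph.Runs[paragraph.Runs.Count - 1].ToString(SaveFormat.Text).Trim().Length == 0)
{
    paragraph.Runs[paragraph.Runs.Count - 1].Remove();
}
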
document.Save(MyDir + "output.docx");
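
Since the same trimming has to run on every child document before the merge, the steps above could be folded into one reusable helper. The following is only a minimal sketch under some assumptions: the class name BlankSpaceTrimmer and its methods TrimStart, TrimEnd and IsBlank are hypothetical names (not part of Aspose.Words). It also differs from the code above in one way: it inspects the first and last child node of the body rather than FirstParagraph and LastParagraph, so a leading or trailing table is left alone.

```csharp
using Aspose.Words;

// Hypothetical helper; the class and method names are illustrative only.
public static class BlankSpaceTrimmer
{
    // True when the paragraph has no visible text and no shapes.
    private static bool IsBlank(Paragraph paragraph)
    {
        return paragraph.ToString(SaveFormat.Text).Trim().Length == 0
            && paragraph.GetChildNodes(NodeType.Shape, true).Count == 0
            && paragraph.GetChildNodes(NodeType.GroupShape, true).Count == 0;
    }

    // Removes blank paragraphs from the start of the first section.
    public static void TrimStart(Document document)
    {
        if (document.FirstSection == null || document.FirstSection.Body == null)
            return;

        Body body = document.FirstSection.Body;
        Paragraph first = body.FirstChild as Paragraph;
        while (first != null && IsBlank(first))
        {
            first.Remove();
            first = body.FirstChild as Paragraph;
        }
    }

    // Removes blank paragraphs, then whitespace-only runs, from the end of the document.
    public static void TrimEnd(Document document)
    {
        while (document.LastSection != null)
        {
            Body body = document.LastSection.Body;
            if (body == null)
                break;

            Paragraph last = body.LastChild as Paragraph;

            // The body ends with a table or another non-paragraph node: stop.
            if (last == null && body.HasChildNodes)
                break;

            if (last != null)
            {
                if (!IsBlank(last))
                    break;

                // As in the code above, keep a blank paragraph that directly
                // follows a non-paragraph node such as a table.
                if (last.PreviousSibling != null &&
                    last.PreviousSibling.NodeType != NodeType.Paragraph)
                    break;

                last.Remove();
            }

            // If the section became empty, remove it and continue with the previous one.
            if (!body.HasChildNodes)
                document.LastSection.Remove();
        }

        // Strip trailing runs that contain only whitespace, such as tabs.
        Paragraph tail = null;
        if (document.LastSection != null && document.LastSection.Body != null)
            tail = document.LastSection.Body.LastParagraph;

        while (tail != null && tail.Runs.Count > 0
            && tail.Runs[tail.Runs.Count - 1].Text.Trim().Length == 0)
        {
            tail.Runs[tail.Runs.Count - 1].Remove();
        }
    }
}
```

You would call BlankSpaceTrimmer.TrimStart(child) and BlankSpaceTrimmer.TrimEnd(child) on each child document before merging it into the parent. Please test it against your own templates, because content in other node types may need additional handling.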