I am appending a cover page to a main document. Neither document has extra pages or breaks on its own, but when I append the first document to the second, the result contains extra pages or breaks.
Document srcDoc = new Document("1.docx");
Document dstDoc = new Document("2.docx");
// Append srcDoc (1.docx) to the end of dstDoc (2.docx), keeping the source formatting.
dstDoc.AppendDocument(srcDoc, ImportFormatMode.KeepSourceFormatting);
dstDoc.Save(ArtifactsDir + "Document.AppendDocument.docx");
@smooney1234 Could you please attach your source documents here for testing? We will check the issue and provide you with more information. In the attached ZIP file I see only the output document produced by version 19.5 of Aspose.Words.
@smooney1234 Thank you for the additional information. The problem is not reproducible with the attached source document and the latest version of Aspose.Words (22.6). I will wait for your original files.
Could you please also try with the latest version of Aspose.Words on your side?
@smooney1234 Thank you for the additional information. I have managed to reproduce the problem on my side. However, it looks like this is not a bug in Aspose.Words but an issue in the COPY_UMEM22382928.doc document itself. If you simply open and save this document in MS Word, the same empty pages are added. The document appears to have some issues that both Aspose.Words and MS Word resolve silently, and resolving them adds page breaks.
@smooney1234 Do you mean that the documents you attached were originally produced by Aspose.PDF by converting PDF to DOC? If you use .NET Framework 4.6.1 or newer, .NET Core 2.0 or newer, or .NET 5 or newer, you can load PDF documents directly into an Aspose.Words.Document object: https://docs.aspose.com/words/net/convert-pdf-to-other-document-formats/
Could you please attach your source PDF document here, so we can test with the original document?
Regarding Aspose.PDF, you should ask in the appropriate Aspose.PDF support forum.
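For reference, here is a minimal sketch of loading a PDF directly into Aspose.Words and saving it as DOCX, assuming one of the framework versions listed above. The file names input.pdf and output.docx are placeholders, not files from this thread.

```csharp
using Aspose.Words;

class PdfToDocxSketch
{
    static void Main()
    {
        // Placeholder input path: Aspose.Words detects the PDF format on load.
        Document doc = new Document("input.pdf");

        // The output format is inferred from the file extension.
        doc.Save("output.docx");
    }
}
```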
@smooney1234 The size of the image is not the problem. The empty page is caused by a redundant page break at the end of the section. The section break that follows already starts a new page, so the explicit page break before it produces an extra, empty page.
To resolve the problem, you can remove this redundant page break at the end of the section:
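Document doc = new Document(@"C:\Temp\COPY_UMEM22382928.doc");
// Copy the runs into an array first: GetChildNodes returns a live collection,
// so removing nodes while enumerating it directly is not safe.
Node[] runs = doc.GetChildNodes(NodeType.Run, true).ToArray();
foreach (Run r in runs)
{
    // Get the paragraph that follows the run's parent paragraph.
    Paragraph nextPara = r.ParentParagraph.NextSibling as Paragraph;
    // Remove the run only if it is a page break at the very end of its
    // paragraph and the next paragraph is the last one in the section.
    if ((r.Text == ControlChar.PageBreak) &&
        (nextPara != null) &&
        nextPara.IsEndOfSection &&
        (r == r.ParentParagraph.LastChild))
    {
        r.Remove();
    }
}
doc.Save(@"C:\Temp\out.doc");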
@smooney1234 Sure, I have tested the code with your COPY_UMEM22382928.doc document, and it removes 6 redundant page breaks. After saving, the document no longer has redundant empty pages.