I am using a bookmark to remove content from the pages. That works fine, but the headers and footers are not removed, so I end up with multiple blank pages that contain only header and footer content.
doc.Range.Bookmarks[bookmarkName].Text = "";
foreach (Section section in doc.Sections)
{
    if (section.ToString(SaveFormat.Text).Trim() == String.Empty)
        section.Remove();
}
To remove blank pages that only contain header and footer content from your document, you need to ensure that you are also removing the headers and footers in addition to the content. The code snippet you provided only clears the text from the bookmark and removes sections that are empty, but it does not address the headers and footers.
Here’s how you can modify your approach to remove the headers and footers from each section of the document:
1. Iterate through each section of the document.
2. For each section, check and remove the headers and footers of all types (first, primary, and even).
3. After removing the headers and footers, check if the section is empty and remove it if necessary.
Here is a sample code snippet that demonstrates this process:
import com.aspose.words.Document;
import com.aspose.words.SaveFormat;
import com.aspose.words.Section;

public class RemoveBlankPages {
    public static void main(String[] args) throws Exception {
        Document doc = new Document("path/to/your/document.docx");

        // Remove all types of headers and footers (first, primary, and even) from every section
        for (Section section : doc.getSections()) {
            section.getHeadersFooters().clear();
        }

        // Now remove empty sections, iterating backwards so removals do not shift unvisited indexes
        for (int i = doc.getSections().getCount() - 1; i >= 0; i--) {
            Section section = doc.getSections().get(i);
            // In Aspose.Words for Java, SaveFormat values are int constants such as SaveFormat.TEXT
            if (section.toString(SaveFormat.TEXT).trim().isEmpty()) {
                section.remove();
            }
        }

        doc.save("path/to/your/cleaned_document.docx");
        System.out.println("Done.");
    }
}
In this code, we first remove all headers and footers from each section. Then, we check if any section is empty (after removing the headers and footers) and remove it if it is. This should help you eliminate the blank pages that only contain headers and footers.
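A side note on why the removal loop in the snippet runs from the last index down to the first: deleting an item shifts every later item down by one, so a forward loop can skip the element that slides into the freed slot. The following is only a minimal sketch of that pattern using the standard library. The class name, the removeBlank helper, and the sample strings are hypothetical stand-ins for sections and are not part of the Aspose.Words API.

```java
import java.util.ArrayList;
import java.util.List;

public class ReverseRemovalSketch {
    // Removes blank entries in place, walking from the last index to the first
    // so that a removal never shifts an index that has not been visited yet.
    static void removeBlank(List<String> sections) {
        for (int i = sections.size() - 1; i >= 0; i--) {
            if (sections.get(i).trim().isEmpty()) {
                sections.remove(i);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in data: each string plays the role of one section's text.
        List<String> sections = new ArrayList<>(List.of("Offer letter", "  ", "", "Terms"));
        removeBlank(sections);
        System.out.println(sections); // prints [Offer letter, Terms]
    }
}
```

The two adjacent blank entries are the case a forward index loop tends to get wrong, since the second blank moves into the position that was just processed.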
Make sure to replace "path/to/your/document.docx" and "path/to/your/cleaned_document.docx" with the actual paths to your documents.
This approach should effectively clean up your document by removing unwanted blank pages.
I have 24 pages, and I want to remove a page, along with its header and footer, only where there is no content, meaning pages that contain nothing but the header and footer. The header and footer should not be removed from the other pages, which do have content. I have attached the file; I want to remove the last two pages because they have no content. I am using C#. offerletter_hcpqa_applicationid_131_101200744am_13_12_2024_10_49_6_722.pdf (256.6 KB)
@SachinSingh Do you use a PDF document as the input? Aspose.Words is designed to work with MS Word documents in the first place. Although Aspose.Words can load PDF documents, the PDF and MS Word document models are very different, and there may be nuances when PDF documents are loaded into Aspose.Words.
If you use an MS Word document as the input, please attach it here for testing. Have you tried the built-in method Document.RemoveBlankPages?
Can you give me an example? I cannot find any such method. I tried doc.RemoveBlankPages(), but I get an error saying the method doesn’t exist.
Yes, I am using a Word document only. After merging the tags, I just download it as a PDF.
@SachinSingh Word documents are not a fixed-page format; they are flow documents, more like HTML. So there is no easy way to determine where a page starts or ends, and no easy way to determine whether a particular page is blank. However, there are a few ways to set an explicit page break in a Word document. For example,