I want to merge two Word documents that each contain a table of contents, and I have a few questions:
- How do I merge the table of contents of the second document into the table of contents of the main document?
- How do I delete the original table of contents from the merged document?
- How do I make the page numbers that follow the table of contents continue with the page numbering of the main text, instead of keeping the page numbers of the original document?
JAR version: words-24.3-jdk17.jar
Code:
Document doc1 = new Document("1.doc");
Document doc2 = new Document("2.doc");
doc1.appendDocument(doc2, ImportFormatMode.KEEP_SOURCE_FORMATTING);
for (Field field : doc1.getRange().getFields())
{
    if (field.getType() == FieldType.FIELD_TOC)
    {
        field.update();
    }
}
doc1.updatePageLayout();
doc1.updateFields();
doc1.save("E:/TEST/Out.doc");
@serendipity.zhq
To merge two Word documents with a table of contents (TOC) and address your specific questions, you can follow these steps:
- Merging the Table of Contents: After merging the documents, you can update the TOC fields in the main document to include entries from the second document. This is done by iterating through the fields in the merged document and updating any TOC fields.
- Deleting the Original Table of Contents: If you want to remove the original TOC from the merged document, you can identify the TOC field and remove it from the document. This can be done by checking the field type and using the appropriate method to remove it.
- Synchronizing Page Numbers: To ensure that the page numbers in the TOC reflect the correct page numbers of the main text, you can restart the page numbering in the second document before merging it. After merging, you will need to update the TOC fields again to reflect the new page numbers.
Here is an example code snippet that demonstrates how to achieve this:
Document doc1 = new Document("1.doc");
Document doc2 = new Document("2.doc");
// Restart page numbering in the second document
doc2.getFirstSection().getPageSetup().setRestartPageNumbering(true);
doc2.getFirstSection().getPageSetup().setPageStartingNumber(1);
// Merge the documents
doc1.appendDocument(doc2, ImportFormatMode.KEEP_SOURCE_FORMATTING);
// Update TOC fields
for (Field field : doc1.getRange().getFields()) {
    if (field.getType() == FieldType.FIELD_TOC) {
        field.update();
    }
}
// Remove the original TOC if needed
for (Field field : doc1.getRange().getFields()) {
    if (field.getType() == FieldType.FIELD_TOC) {
        field.remove();
    }
}
// Save the merged document
doc1.save("E:/TEST/Out.doc");
In this code:
- The page numbering is restarted in the second document before merging.
- The documents are merged while keeping the source formatting.
- The TOC fields are updated to reflect the new structure.
- The original TOC can be removed if necessary.
Make sure to adjust the logic for removing the TOC based on your specific requirements. This approach should help you effectively merge the documents while managing the TOC as needed.
If you have further questions or need additional assistance, feel free to ask!
The above code does not solve my second and third problems.
@serendipity.zhq There is no need to merge the TOCs. If the same type of TOC is used in both documents, you can simply remove the TOC from the second document and update the TOC in the main document after merging:
Document doc1 = new Document("C:\\Temp\\doc1.docx");
// Note: this loads the same file as doc1; substitute the path to your second document.
Document doc2 = new Document("C:\\Temp\\doc1.docx");
// Remove TOC from the second document.
for (Field f : doc2.getRange().getFields())
{
    if (f.getType() == FieldType.FIELD_TOC)
        f.remove();
}
doc1.appendDocument(doc2, ImportFormatMode.USE_DESTINATION_STYLES);
// update fields.
doc1.updateFields();
doc1.save("C:\\Temp\\out.docx");
If the problem persists, please zip and attach your source documents here for our reference. We will check them and provide more information.
The above code removes the table of contents of the second document, but the page where the table of contents was located is left blank. How can I remove this blank page?
@serendipity.zhq Could you please attach your input, output and expected output documents here for testing? It is quite difficult to answer the question without your actual documents.
In the attached archive, “12.doc” is the original document (used for both inputs), and “Out.doc” is the generated document.
In “Out.doc” you can see that the table of contents lines are compressed in width, and the page numbers keep restarting, cycling from 1 to 14. The table of contents of the second merged document is removed, but blank pages remain.
@serendipity.zhq To continue page numbering across the merged document, you should reset the RestartPageNumbering flag:
Document doc1 = new Document("C:\\Temp\\12.doc");
Document doc2 = new Document("C:\\Temp\\12.doc");
// Remove TOC from the second document.
for (Field f : doc2.getRange().getFields())
{
    if (f.getType() == FieldType.FIELD_TOC)
        f.remove();
}
// Reset numbering restarting.
for (Section s : doc2.getSections())
    s.getPageSetup().setRestartPageNumbering(false);
// Merge documents.
doc1.appendDocument(doc2, ImportFormatMode.USE_DESTINATION_STYLES);
// update fields.
doc1.updateFields();
doc1.save("C:\\Temp\\out.doc");
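To illustrate why clearing the flag makes the numbering continue, here is a minimal plain-Java sketch that does not use Aspose.Words at all. The `SectionModel` class and `displayedPageNumbers` method are hypothetical names invented for this illustration, and the model assumes that a restart always begins at 1 and that each document is a single 14-page section (the 14 comes from the "1 to 14" loop reported above).

```java
import java.util.ArrayList;
import java.util.List;

final class PageNumberingSketch {
    /** Hypothetical stand-in for a section: its page count and whether numbering restarts at 1. */
    static final class SectionModel {
        final int pageCount;
        final boolean restartPageNumbering;

        SectionModel(int pageCount, boolean restartPageNumbering) {
            this.pageCount = pageCount;
            this.restartPageNumbering = restartPageNumbering;
        }
    }

    /** Returns the number shown on each page, in document order. */
    static List<Integer> displayedPageNumbers(List<SectionModel> sections) {
        List<Integer> numbers = new ArrayList<>();
        int next = 1;
        for (SectionModel section : sections) {
            if (section.restartPageNumbering) {
                next = 1; // numbering starts over at this section
            }
            for (int i = 0; i < section.pageCount; i++) {
                numbers.add(next++);
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Two 14-page documents appended together; the second one restarts numbering.
        List<SectionModel> restarted = new ArrayList<>();
        restarted.add(new SectionModel(14, true));
        restarted.add(new SectionModel(14, true));
        System.out.println(displayedPageNumbers(restarted).get(14)); // prints 1

        // Same documents with the restart flag cleared on the appended section.
        List<SectionModel> continued = new ArrayList<>();
        continued.add(new SectionModel(14, true));
        continued.add(new SectionModel(14, false));
        System.out.println(displayedPageNumbers(continued).get(14)); // prints 15
    }
}
```

In this model the first page of the appended document shows 1 while the flag is set and 15 once it is cleared, which matches the behaviour described in this thread.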
The problem with the TOC appearance is not a bug. If you update the TOC in the source document, you will see exactly the same result. It is caused by an incorrect tab stop defined in the TOC styles. You can correct this using the following code:
// Built-in TOC style identifiers (levels 1 to 9).
int[] tocStyles = {
    StyleIdentifier.TOC_1, StyleIdentifier.TOC_2, StyleIdentifier.TOC_3,
    StyleIdentifier.TOC_4, StyleIdentifier.TOC_5, StyleIdentifier.TOC_6,
    StyleIdentifier.TOC_7, StyleIdentifier.TOC_8, StyleIdentifier.TOC_9
};
// Place a right-aligned tab stop at the right edge of the text area.
PageSetup ps = doc1.getFirstSection().getPageSetup();
double tabStop = ps.getPageWidth() - ps.getRightMargin() - ps.getLeftMargin();
for (int style : tocStyles)
{
    Style tocStyle = doc1.getStyles().getByStyleIdentifier(style);
    tocStyle.getParagraphFormat().getTabStops().clear();
    tocStyle.getParagraphFormat().getTabStops().add(tabStop, TabAlignment.RIGHT, TabLeader.DOTS);
}
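As a worked example of the tab stop arithmetic in the snippet above, the following plain-Java sketch computes the same value without Aspose.Words. The class and method names are hypothetical, and the page size and margins (US Letter, 612 pt wide, with 1-inch or 72 pt margins) are assumed values for illustration, not measurements taken from the attached document.

```java
final class TocTabStopSketch {
    /** Width of the text area in points: page width minus both horizontal margins. */
    static double rightTabStop(double pageWidth, double leftMargin, double rightMargin) {
        return pageWidth - rightMargin - leftMargin;
    }

    public static void main(String[] args) {
        // Assumed example values: US Letter is 612 pt wide; 1-inch margins are 72 pt each.
        double tabStop = rightTabStop(612.0, 72.0, 72.0);
        System.out.println(tabStop); // prints 468.0
    }
}
```

A tab stop at 468 pt sits exactly at the right edge of the text area, so the page numbers in the TOC line up with the right margin.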
So how can I solve the blank page problem after the table of contents is removed?
@serendipity.zhq I do not see any empty pages in the output document. However, you can use the built-in Document.removeBlankPages method to remove empty pages from a document.
Thanks for your help and have a nice day!
Is there any other way to delete blank pages? I tried doc.getRange().replace("&m", ""); but it did not work, and the version I am using (words-24.3-jdk17.jar) does not have the Document.removeBlankPages method.
The second page is a blank page left over after deleting the table of contents.
@serendipity.zhq
Most likely you are using an old version of Aspose.Words. The Document.removeBlankPages method was introduced in Aspose.Words 24.5.
I do not see any empty pages in your document. Here it is rendered to PDF; as you can see, there are no empty pages:
out.pdf (196.9 KB)
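For readers unsure whether their JAR is new enough, a minimal plain-Java sketch of the version comparison follows. The class and method names are hypothetical and not part of Aspose.Words; the only fact taken from this thread is that Document.removeBlankPages was introduced in 24.5. The components are compared numerically rather than as strings, so that a version such as 24.10 counts as newer than 24.5.

```java
final class VersionCheckSketch {
    /** Compares dotted version strings numerically, component by component. */
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            // Missing components count as 0, so "24" equals "24.0".
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** True when the given version is 24.5 or later, per the note in this thread. */
    static boolean hasRemoveBlankPages(String version) {
        return compareVersions(version, "24.5") >= 0;
    }

    public static void main(String[] args) {
        System.out.println(hasRemoveBlankPages("24.3")); // prints false
        System.out.println(hasRemoveBlankPages("24.5")); // prints true
    }
}
```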
I know. When I open the document in WPS, a blank page appears. When I open it in Microsoft Office, there is no blank page.
@serendipity.zhq This looks like a peculiarity of WPS Office. Your document contains sections with only one empty paragraph. You can try removing them. For example, you can use the following code:
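Document doc = new Document("C:\\Temp\\in.doc");
// Iterate over a snapshot so that removing a section does not disturb the loop.
for (Section s : doc.getSections().toArray())
{
    // A section qualifies only if its body holds a single node and that node is an empty paragraph.
    if (s.getBody().getChildNodes(NodeType.ANY, false).getCount() == 1)
    {
        Paragraph p = s.getBody().getFirstParagraph();
        if (p != null && p.toString(SaveFormat.TEXT).trim().isEmpty())
            s.remove();
    }
}
doc.save("C:\\Temp\\out.doc");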
out.zip (71.1 KB)
Thank you very much, your code solved my problem perfectly!
While testing document merging, I found that the header and footer content disappeared from the generated document, and the document watermark disappeared as well. How can I keep them?
@serendipity.zhq Could you please attach your input and output documents here for testing? We will check them and provide you more information.
TEST.7z (181.5 KB)
In the archive, test4.docx is the source file used for both inputs of the merge, and Out.docx is the generated file.
for (Section section : doc1.getSections()) {
    section.getPageSetup().setRestartPageNumbering(false);
    section.getHeadersFooters().linkToPrevious(true);
}
for (Section section : doc.getSections()) {
    HeaderFooter footer = section.getHeadersFooters().getByHeaderFooterType(HeaderFooterType.FOOTER_PRIMARY);
    if (footer == null) {
        footer = new HeaderFooter(doc, HeaderFooterType.FOOTER_PRIMARY);
        section.getHeadersFooters().add(footer);
    }
    // containsPageFields is a helper that is not shown in this post.
    boolean hasPageFields = containsPageFields(footer);
    if (!hasPageFields) {
        Paragraph para = new Paragraph(doc);
        footer.appendChild(para);
        para.getParagraphFormat().setAlignment(ParagraphAlignment.CENTER);
        para.appendChild(new Run(doc, "—"));
        para.appendField(FieldType.FIELD_PAGE, false);
        para.appendChild(new Run(doc, "—"));
    }
}
I used the above code to regenerate the page numbers. If I don’t use the above code, is there any other way to preserve the header and footer content (including page numbers) of the original document?
@serendipity.zhq To keep the documents' headers and footers, you can simply merge the documents without any modifications:
Document doc1 = new Document("C:\\Temp\\test4.docx");
Document doc2 = new Document("C:\\Temp\\test4.docx");
doc1.appendDocument(doc2, ImportFormatMode.USE_DESTINATION_STYLES);
doc1.save("C:\\Temp\\out.docx");
Also, you can try using the Merger class:
Merger.merge("C:\\Temp\\out.docx", new String[] { "C:\\Temp\\test4.docx", "C:\\Temp\\test4.docx" }, SaveFormat.DOCX, MergeFormatMode.KEEP_SOURCE_LAYOUT);
out.docx (140.6 KB)