Hi Amin,
Thanks for your inquiry.
*aminzamani:
I was able to implement the code as you described. But it is not always working. I have attached an input file. The page break is not in the new document. But when i use your input files that you have attached before (the 2 ones) it works. Maybe best is when you attach your class that you have coded “FindandSplitDocument.java” so I can be 100% sure that I have that one you used. Please also check if it works for you with the attached input file “test1-input.doc”. The page break in my case is not inside the second output file.*
Yes, for test1-input.doc the code does not insert the section break into the output documents. I forgot to include the following lines of code, which handle the last ‘finish’ word.
. . .
. . .
// Extract everything from the last bookmark to the end of the document.
BookmarkStart bStart = ((Bookmark)bookmarks.get(bookmarks.size() - 1)).getBookmarkStart();
ArrayList nodes = extractContent2(bStart, doc.getLastSection().getBody().getLastParagraph(), true);
Document newdoc = generateDocument2(doc, nodes);
// Re-insert the section break only when the 'finish' paragraph originally contained one ("BM_S" bookmark).
if (newdoc.getFirstSection().getBody().getFirstParagraph() != null &&
    newdoc.getFirstSection().getBody().getFirstParagraph().toString(SaveFormat.TEXT).trim().equals(searchKeyWord)
    && bStart.getBookmark().getName().startsWith("BM_S"))
{
    DocumentBuilder newbuilder = new DocumentBuilder(newdoc);
    newbuilder.moveTo(newdoc.getFirstSection().getBody().getFirstParagraph());
    newbuilder.insertBreak(BreakType.SECTION_BREAK_CONTINUOUS);
    newbuilder.getCurrentParagraph().remove();
}
// Remove the helper bookmarks before saving the last part.
newdoc.getRange().getBookmarks().clear();
newdoc.save(MyDir + "Out_" + bookmarks.size() + ".docx");
*aminzamani:
by the way: Why do we only set a bookmark if in the input file is a page break? The answer seems to be, because then the page break is not in the new generated document when not setting it. But everything should be in the generated document which is between the start and end of the extracted content. It seems for me possible that there could be some more data than only a page break that we add manually.*
Bookmarks are inserted for all ‘finish’ words. If the ‘finish’ paragraph contains a section break, the inserted bookmark's name starts with ‘BM_S’; otherwise it starts with ‘BM_’. Please check the following lines of code.
if (((Run)runs.get(0)).getParentParagraph().getText().contains(ControlChar.SECTION_BREAK))
{
    builder.startBookmark("BM_S" + i);
    builder.endBookmark("BM_S" + i);
}
else
{
    builder.startBookmark("BM_" + i);
    builder.endBookmark("BM_" + i);
}
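To illustrate how the bookmark name carries the "section break was here" flag from the bookmarking pass to the extraction pass, here is a minimal plain-Java sketch without Aspose.Words. The class BookmarkNames and its methods are hypothetical, and the SECTION_BREAK constant is only an assumed stand-in for ControlChar.SECTION_BREAK.

```java
public class BookmarkNames {
    // Assumption: stands in for ControlChar.SECTION_BREAK, taken here to be the form feed character.
    static final String SECTION_BREAK = "\f";

    // Pass 1: choose the bookmark name for the i-th 'finish' paragraph.
    public static String bookmarkName(String paragraphText, int index) {
        return (paragraphText.contains(SECTION_BREAK) ? "BM_S" : "BM_") + index;
    }

    // Pass 2: decide from the name alone whether a section break must be re-inserted.
    public static boolean needsSectionBreak(String bookmarkName) {
        return bookmarkName.startsWith("BM_S");
    }

    public static void main(String[] args) {
        String plain = bookmarkName("finish", 0);
        String withBreak = bookmarkName("finish" + SECTION_BREAK, 1);
        System.out.println(plain + " -> " + needsSectionBreak(plain));
        System.out.println(withBreak + " -> " + needsSectionBreak(withBreak));
    }
}
```

Every ‘finish’ paragraph gets a bookmark either way; only the prefix differs, so the later check on "BM_S" never matches a plain "BM_" + number name.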
*aminzamani:
Do we have add other elements manually into the generated elements, too? Is there no way to easily copy everything from the start till to the end of the document? Also an other problem : We use aspose word because we want to split the documents as described inside this ticket. This documents will be given to someone by an workflow in an enterprise content management system. Many scientists will get the splitted documents and modify them. It is really a very important project and so very important that everything is inside the splitted document which was between the extracted content in the source document. Then when the people have edited the splitted documents we will merge it back to one document. Therefore it is not acceptable when some parts are not there.*
You can insert images, text, bookmarks, tables, etc. into the generated document, so your requirement can be achieved. Please read the following documentation links for reference.
https://docs.aspose.com/words/java/aspose-words-document-object-model/
https://docs.aspose.com/words/java/logical-levels-of-nodes-in-a-document/
Please also check the code at the following documentation links.
https://docs.aspose.com/words/java/find-and-replace/
https://docs.aspose.com/words/java/extract-selected-content-between-nodes/
*aminzamani:
The splitting and merging must be finish as soon as possible because in 8 days we have to show it to our customer. I have to ensure that the splitting is working with everything which is in the source document between the search words and thus also copied to the splitted documents. The last step is merging the splitted documents back to one document.*
To merge the split documents, please use the Document.appendDocument method, which appends the specified document to the end of the current document. Please read the following documentation link.
https://docs.aspose.com/words/java/insert-and-append-documents/
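When merging, the parts need to be appended in their original order. Here is a minimal plain-Java sketch of that ordering step, assuming the parts are named Out_<n>.docx as in the save call earlier in this reply. The class PartOrder and its methods are hypothetical. Each file name in the sorted list would then be loaded and passed to Document.appendDocument.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PartOrder {
    // Assumed naming scheme of the split parts: Out_<n>.docx
    private static final Pattern PART = Pattern.compile("Out_(\\d+)\\.docx");

    // Extracts the numeric part index from a file name such as "Out_12.docx".
    public static int partNumber(String fileName) {
        Matcher m = PART.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a split part: " + fileName);
        }
        return Integer.parseInt(m.group(1));
    }

    // Returns a copy sorted by part number, so that Out_10 comes after Out_2.
    public static List<String> sortParts(List<String> fileNames) {
        List<String> sorted = new ArrayList<>(fileNames);
        sorted.sort(Comparator.comparingInt(PartOrder::partNumber));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> parts = List.of("Out_10.docx", "Out_2.docx", "Out_1.docx");
        for (String name : sortParts(parts)) {
            // Here each part would be loaded and appended to the merged document.
            System.out.println(name);
        }
    }
}
```

Sorting by the parsed number rather than by the raw string avoids the case where "Out_10.docx" is ordered before "Out_2.docx".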
*aminzamani:
Could you ensure / confirm that everything is inside the splitted document? As said for page breaks we put them manually into the splitted documents. Are there other elements which must be added manually to the splitted documents, like:*
The extractContent method copies the content between the start and end nodes. However, it does not extract section breaks, which is why the section break is added separately after the content has been extracted.
Please note that FindAndInsertBookmark and extractContent work fine. Regarding the FindReplaceTest method, it seems that all of your scenarios are covered in the shared code. However, you will need to modify the code according to your requirements.