Free Support Forum - aspose.com

Splitting document not fully respecting page breaks of original file

I'm using a fresh download of Aspose.Words for Java 17.9 with an extended evaluation licence and the PageSplitter example from here: https://reference.aspose.com/words/java/com.aspose.words/Document#extractPages(int,int)
In some not so rare cases (so far 2 out of 4 documents), the extracted pages do not have the exact content of the corresponding pages in the original document. In our use case (online selection of pages to be proofread), that is disastrous behaviour.
I’ll upload one of the two failing docs. The second is a customer’s master's thesis… Damn, docx not accepted for upload! How do I upload that?

Thanks for your inquiry. Please share your input, output and expected documents as a ZIP file. We will look into them and guide you accordingly. If the resource file is large, you can share it via a free file sharing service, e.g. Dropbox or Google Drive.

aspose_test.zip (266.1 KB)
This is one of the (many) failing documents. E.g. page 5 is split incorrectly. Also note that some of the “page” documents are spread over 2 pages.
Note: I’ve changed line 311 of the PageSplitter example to avoid the String comparison using ==. This changed the errors made by the program, but neither created nor solved the issue.

Sorry, the new line 311 is:

else if ((prevParagraph.getParagraphFormat().getStyleName() == null ? paragraph.getParagraphFormat().getStyleName() == null : prevParagraph.getParagraphFormat().getStyleName().equals(paragraph.getParagraphFormat().getStyleName())) && paragraph.getParagraphFormat().getNoSpaceBetweenParagraphsOfSameStyle())
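
For anyone following along: the change above matters because == on Java strings compares object references, not contents. Below is a minimal sketch of the same null-safe comparison written with java.util.Objects.equals, which has the same truth table as the ternary in the new line 311. The class and method names (StyleNameCompare, sameStyleName) are hypothetical and are not part of the PageSplitter example or the Aspose API.

```java
import java.util.Objects;

// Hypothetical helper, for illustration only; not part of PageSplitter or Aspose.Words.
class StyleNameCompare {

    // Null-safe content comparison: true if both are null,
    // or if both are non-null and have equal characters.
    static boolean sameStyleName(String prev, String current) {
        return Objects.equals(prev, current);
    }

    public static void main(String[] args) {
        // Two distinct String objects with identical contents.
        String a = new String("Heading 1");
        String b = new String("Heading 1");

        System.out.println(a == b);                    // false: different references
        System.out.println(sameStyleName(a, b));       // true: same contents
        System.out.println(sameStyleName(null, null)); // true: both null
        System.out.println(sameStyleName(null, b));    // false: only one is null
    }
}
```

With a helper like this, the first half of the condition would shrink to a single call taking the two getStyleName() results, which might be easier to read than the nested ternary.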

Thanks for sharing your source document. We have tested the scenario and reproduced the reported issue. We have logged the following tickets in our issue tracking system for further investigation and rectification. We will keep you updated on the progress of the issue resolution.
WORDSNET-15873: Incorrect split of a page
WORDSNET-15874: one page splits over two pages