Calculating page breaks

Hi.

We want our system to automatically generate and send large numbers of documents without human intervention.

However, determining where page breaks should be is quite a headache.

Our reports consist of report modules called “building blocks”. Currently, we try to anticipate where page breaks should be by starting large building blocks off with a page break. Although this prevents tables from starting near the end of a page, we still have the problem of unpredictable white space. Also, when a large building block follows another building block that just happens to end on the very last line of the previous page, the page break causes a blank page to be inserted in the document.

To overcome this problem, I’m contemplating the following solution: Each building block will contain a method telling the report engine the minimum amount of space (in points) it needs before generating, and how much space it used afterwards. The engine will keep track of how many points are used and how many points are still available on the page. Based on this information, the engine can then make informed decisions on when to create page breaks. The problem with this solution is that every building block will have to calculate how much space it is using. This is particularly problematic in tables, as it would be very difficult to predict when text will wrap to the next line. Apart from the massive overhead of having to calculate the height of every single element in the building block, the solution will also suffer from not being 100% accurate. It is nice in theory, but I suspect it would be a maintenance nightmare in practice.
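
For illustration, the bookkeeping described above could look roughly like this. It is only a minimal sketch in plain Python, not Aspose code: `Block`, `min_space`, `used_space`, `plan_page_breaks` and the 700-point page height are all made-up names and values, and it assumes each block can report its heights accurately, which is exactly the hard part.

```python
from dataclasses import dataclass

PAGE_HEIGHT_PT = 700.0  # assumed usable height of one page, in points


@dataclass
class Block:
    name: str
    min_space: float   # points the block needs before it may start on a page
    used_space: float  # points the block reports having used after rendering


def plan_page_breaks(blocks, page_height=PAGE_HEIGHT_PT):
    """Return the names of the blocks that should be preceded by a page break."""
    breaks = []
    used = 0.0  # points consumed so far on the current page
    for block in blocks:
        remaining = page_height - used
        # Break only if the block does not fit AND the page already has content,
        # so a block that follows an exactly full page never causes a blank page.
        if block.min_space > remaining and used > 0:
            breaks.append(block.name)
            used = 0.0
        used += block.used_space
        # A block that runs past the end of the page simply flows onto the next one.
        used %= page_height
    return breaks


blocks = [Block("a", 100, 500), Block("b", 300, 300), Block("c", 100, 100)]
print(plan_page_breaks(blocks))  # ['b']
```

In this sketch a block that ends exactly on the last line leaves `used` at zero, so the next block starts on the fresh page without an extra break.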

I’m hoping you guys might know of a better solution - one that might already be built into Aspose, or one that would work well with Aspose. I am aware of KeepTogether and KeepWithNext, but these functions are very limited and do not always work (for instance, you can’t keep an image of a graph with a table).

Your help is greatly appreciated.

Hi Hanno,

Thanks for your inquiry. A Word document is a flow document and does not contain any information about its layout into lines and pages. Therefore, technically there is no “page” concept in a Word document. Pages are created by Microsoft Word on the fly.
Aspose.Words uses its own rendering engine to lay out documents into pages. Please check the DocumentLayoutHelper sample from the offline samples pack. This sample demonstrates how to work with the layout elements of a document and access the pages, lines, spans, etc.

If you want to extract each page from a document individually and export it to DOCX, please try the PageSplitter sample, which was designed for this task.

Moreover, you can get the page number of a Node by using the LayoutCollector.GetStartPageIndex method. Please read about the Aspose.Words.Layout namespace here:
https://reference.aspose.com/words/net/aspose.words.layout/
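
As a rough illustration of how page indexes could be used for the page break problem: once the start and end page of each building block are known, finding the blocks that straddle a page boundary is simple. The sketch below is plain Python, not Aspose code. The `page_spans` dictionary and the function name are hypothetical, and the numbers are made-up stand-ins for values that would be read from the layout (for example with LayoutCollector.GetStartPageIndex and LayoutCollector.GetEndPageIndex).

```python
def blocks_split_across_pages(page_spans):
    """page_spans maps a block name to its (start_page, end_page) indexes.

    Returns the names of the blocks that end on a later page than they start.
    """
    return [name for name, (start, end) in page_spans.items() if end > start]


# Made-up example values, not output from a real document.
page_spans = {"summary": (1, 1), "table": (1, 2), "chart": (3, 3)}
print(blocks_split_across_pages(page_spans))  # ['table']
```

Blocks found this way are candidates for a page break in front of them, after which the layout has to be recalculated before the page indexes are read again.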

Hope this answers your query. Please let us know if you have any more queries.

Thanks. I’ll look into it.

Hi.

I’ve tried running the DocumentLayoutHelper project, but it seems to be incomplete. The definitions for the following are missing:

LayoutEntityType
LayoutEnumerator
LayoutCollector
(See attached)

Also, in our own project, Words.Layout is the only namespace that I can’t pick up with IntelliSense. We’re using Words 11.6. I noticed the DocumentLayoutHelper project uses Words 11.7, but even there, when I comment out all the code in LayoutEntities.cs and RenderedDocument.cs, I cannot pick up Aspose.Words.Layout.

Am I missing something? I can’t figure this out without a working build.

Hi Hanno,

Thanks for your inquiry. Please use the latest version of Aspose.Words for .NET, 13.3.0, which contains the Aspose.Words.Layout namespace.

Please read about the Aspose.Words.Layout namespace here:
https://reference.aspose.com/words/net/aspose.words.layout/