Render pages too slow

Using Aspose.Words 15.1

We regularly work with large documents (500+ pages). Ideally, we want a way to split these documents sequentially, page by page, so we can display the first pages while the rest of the document is still loading.

I have looked at the examples, but DocumentPageSplitter, PageNumberFinder and SectionSplitter take 12 minutes or more to split an 800-page document, which is not an acceptable wait for the user.

So far the quickest way is to render each page using Document.GetPageInfo and then Document.RenderToScale. The problem is that the first page still takes 1 minute 20 seconds to render on an i7 machine with an SSD. Is there any way to do this faster?

Also, is there a quick way to get all the nodes that were used to render a given page? It would be nice if Document.RenderToScale returned an enumerable of the nodes it used to render the page, so you could keep them in memory, do highlighting etc., and then render them again in a new document. Something like:

var nodes = Document.RenderToScale(pageNumber - 1, gr, 0, 0, MyScale);

Here is the page rendering code:

var sw = new Stopwatch();
sw.Start();
PageInfo pageInfo = Data.Document.GetPageInfo(pageNumber - 1);
sw.Stop();
Trace.Write(string.Format("@@ got page {0} information in {1}ms", pageNumber, sw.Elapsed.TotalMilliseconds));

// Zoom factor: 50% for the low-resolution preview, 200% otherwise.
float MyScale = lowResolution ? 0.50f : 2.0f;

// Resolution in DPI: 150 for the low-resolution preview, 250 otherwise.
float MyResolution = lowResolution ? 150.0f : 250.0f;

var pageSize = pageInfo.GetSizeInPixels(MyScale, MyResolution);

using (var docImageStream = new MemoryStream())
using (Bitmap img = new Bitmap(pageSize.Width, pageSize.Height))
{
    img.SetResolution(MyResolution, MyResolution);

    using (Graphics gr = Graphics.FromImage(img))
    {
        // You can apply various settings to the Graphics object.
        gr.TextRenderingHint = TextRenderingHint.AntiAliasGridFit;

        // Fill the page background.
        gr.FillRectangle(System.Drawing.Brushes.White, 0, 0, pageSize.Width, pageSize.Height);

        // Render the page using the zoom.
        sw.Reset();
        sw.Start();
        Data.Document.RenderToScale(pageNumber - 1, gr, 0, 0, MyScale);
        sw.Stop();
        Trace.Write(string.Format("@@ rendered page {0} in {1}ms", pageNumber, sw.Elapsed.TotalMilliseconds));
    }
}

Hi Karrim,

Thanks for your inquiry. Please note that performance and memory usage depend on the complexity and size of the documents you are generating. While rendering a document to fixed page formats (e.g. PDF), Aspose.Words needs to build two models in memory: one for the document and the other for the rendered document.

The process of building the layout model is not linear; it may take a minute to render one page and only a few seconds to render 100 pages. Also, Aspose.Words has to create the APS (Aspose Page Specification) model in memory, and this may take some more time for some documents. Rest assured, we're always working on improving performance, but rendering will always be slower than simply saving to flow formats (e.g. DOC/DOCX).

Could you please attach your input Word document here for testing? I will investigate the issue on my side and provide you more information.

You can download our test document from my public OneDrive:
https://onedrive.live.com/redir?resid=5F154148DE0C0DA9!119

Hi Karrim,

Thanks for sharing the document. I have tested the scenario and found that Document.PageCount takes around 1.25 minutes for this document. This is not an issue. If you call Document.PageCount or Document.UpdatePageLayout after loading the document, subsequent calls to Document.GetPageInfo will not take much time. The Document.UpdatePageLayout method rebuilds the page layout of the document.

Aspose.Words builds the page layout of the whole document when you first try to convert it to a fixed page format. This is the reason the first page takes 1 minute 20 seconds.

While rendering a document to fixed page formats (e.g. PDF or image formats), Aspose.Words needs to build two models in memory: one for the document and the other for the rendered document.
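
As a rough illustration of the point above, the layout cost can be paid once, explicitly, right after loading, so that it does not get attributed to the first GetPageInfo or RenderToScale call. This is only a minimal sketch: the class and method names (LayoutWarmup, BuildLayout, TimePageInfo) are hypothetical helpers for this example, and only Document.UpdatePageLayout and Document.GetPageInfo are Aspose.Words calls.

```csharp
using System;
using System.Diagnostics;
using Aspose.Words;
using Aspose.Words.Rendering;

// Hypothetical helper class, not part of Aspose.Words.
static class LayoutWarmup
{
    // Builds the page layout once and returns how long that took.
    public static TimeSpan BuildLayout(Document doc)
    {
        var sw = Stopwatch.StartNew();
        doc.UpdatePageLayout();
        sw.Stop();
        return sw.Elapsed;
    }

    // Times a single GetPageInfo call; pageIndex is zero-based.
    public static TimeSpan TimePageInfo(Document doc, int pageIndex)
    {
        var sw = Stopwatch.StartNew();
        PageInfo info = doc.GetPageInfo(pageIndex);
        sw.Stop();
        Trace.WriteLine(string.Format(
            "page {0}: {1} x {2} pt, info in {3}ms",
            pageIndex + 1, info.WidthInPoints, info.HeightInPoints,
            sw.Elapsed.TotalMilliseconds));
        return sw.Elapsed;
    }
}
```

Calling LayoutWarmup.BuildLayout(doc) immediately after new Document(path) and then LayoutWarmup.TimePageInfo(doc, 0) should show most of the time in the first call. This does not make the layout itself faster; it only makes clear where the time goes.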

Hope this answers your query. Please let us know if you have any more queries.