Aspose Lazy PDF conversion [Java]

Page limit setting:
We use a page limit setting for PDF conversion, implemented with the IPageLayoutCallback and IPageSavingCallback interfaces, to avoid high CPU utilization.

Using the Aspose libraries [aspose.slides, aspose.cells, aspose.words], we have implemented the page limit settings.

Our client has asked us to load the next 10 pages for preview while the first 10 pages (the configured page limit) are being previewed.

While pages 1-10 of the PDF are being previewed, pages 11-20 should already be ready to load and preview. This matters most for large documents with many pages.

Please share your thoughts on which Aspose packages or methods could support this lazy-load operation with a page limit.
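
To make the request concrete, here is a minimal sketch of the prefetch pattern described above, written without any Aspose types. The class name LazyBatchPreview and the converter function are hypothetical: the converter stands in for whatever call turns one batch of pages into PDF bytes, and it is assumed to be safe to run on a background thread.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

/** Hypothetical helper: returns batch n and starts converting batch n + 1 in the background. */
final class LazyBatchPreview {
    // Hypothetical converter: zero-based batch number -> PDF bytes for that batch.
    private final IntFunction<byte[]> converter;
    private final int batchCount;
    private final Map<Integer, CompletableFuture<byte[]>> cache = new ConcurrentHashMap<>();

    LazyBatchPreview(int totalPages, int pageLimit, IntFunction<byte[]> converter) {
        if (totalPages < 0 || pageLimit <= 0) {
            throw new IllegalArgumentException("totalPages must be >= 0 and pageLimit > 0");
        }
        this.batchCount = (totalPages + pageLimit - 1) / pageLimit; // ceiling division
        this.converter = converter;
    }

    int batchCount() {
        return batchCount;
    }

    /** Returns the bytes for the given zero-based batch and kicks off the next batch. */
    byte[] open(int batch) {
        if (batch < 0 || batch >= batchCount) {
            throw new IndexOutOfBoundsException("batch " + batch);
        }
        CompletableFuture<byte[]> current = load(batch);
        if (batch + 1 < batchCount) {
            load(batch + 1); // start the next batch without waiting for it
        }
        return current.join(); // block only for the batch being shown
    }

    // Each batch is converted at most once; later calls reuse the cached future.
    private CompletableFuture<byte[]> load(int batch) {
        return cache.computeIfAbsent(batch,
                b -> CompletableFuture.supplyAsync(() -> converter.apply(b)));
    }
}
```

With 25 pages and a page limit of 10 there are three batches; open(0) returns pages 1-10 and starts converting pages 11-20. The cache in this sketch is never evicted, so a real viewer would probably want to drop batches the user has scrolled past.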

@dev.raz Unfortunately, Aspose.Words does not have such an ability. I have logged a feature request, WORDSNET-24743. We will consider providing such a feature in one of our future releases. We will be sure to keep you posted and let you know once it is implemented or we have more information for you.

WORDSNET-24743

We hope this feature will be released as soon as possible for Java applications.

For slides-java and cells-java, is there any feature available to support the lazy-load operation using a page limit in a Java application?

@dev.raz,

For Aspose.Cells, the API does not support this feature. We have logged a feature request “CELLSJAVA-45040” into our database. We will look into it soon. We will keep you posted with updates (once available) on it.

@dev.raz,
Regarding Aspose.Slides, if I understand correctly, you would like settings to manage the number of slides exported from a PowerPoint presentation to PDF. You would then need to be notified that the first batch of slides has been converted, and be able to control whether to export the next batch of slides to the PDF document, and so on. Could you please confirm?

@Andrey_Potapov,
Yes, your understanding matches our proposal.
Could you share more details on how to implement this feature?

@dev.raz,
I’ve added a ticket with ID SLIDESJAVA-39075 to our issue-tracking system. Our development team will consider implementing such a feature. You will be notified when a new release of Aspose.Slides with the feature is published.

@dev.raz The feature request has been analyzed by the Aspose.Words development team, and it was concluded that the required functionality can be achieved using IPageLayoutCallback. For example, see the following code (C#):

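using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Aspose.Words;
using Aspose.Words.Layout;
using Aspose.Words.Saving;

private class PreviewCallback : IPageLayoutCallback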
{
    private readonly HashSet<int> _ReadyPages = new HashSet<int>();
    private readonly Action<PageRange, Stream> _ShowPage;
    private readonly int _BatchSize;
    private Document _Document;

    public PreviewCallback(int batchSize, Action<PageRange, Stream> showPage)
    {
        _ShowPage = showPage;
        _BatchSize = batchSize;
    }

    public void Notify(PageLayoutCallbackArgs args)
    {
        // This is simplified logic; some scenarios may not trigger the callback for updated pages.
        // If a document has 7 pages, the batch size is 5, and the following rendering order happened:
        // 1, 2, 3, 4, 5, 6, 7, 2, 3, 4, 6, then the show callback is called for (1-5) and (6-7).
        // It is not called for (2-4) and (6-6) after these were updated.

        _Document = args.Document;

        switch (args.Event)
        {
            case PageLayoutEvent.BuildFinished:
                if (_ReadyPages.Count > 0)
                {
                    var i = (_ReadyPages.Max() / _BatchSize) * _BatchSize;
                    FlushPages(new PageRange(i, _ReadyPages.Where(v => v < i + _BatchSize).Max()));
                }
                break;

            case PageLayoutEvent.PartReflowFinished:
                {
                    _ReadyPages.Add(args.PageIndex);

                    if (_ReadyPages.Count >= _BatchSize)
                    {
                        var from = (_ReadyPages.Min() / _BatchSize) * _BatchSize;
                        var to = _ReadyPages.Max();
                        for (var i = from; i <= to; i += _BatchSize)
                            if (Enumerable.Range(i, _BatchSize).All(v => _ReadyPages.Contains(v)))
                                FlushPages(new PageRange(i, _ReadyPages.Where(v => v < i + _BatchSize).Max()));
                    }

                    break;
                }
        }
    }

    void FlushPages(PageRange pageRange)
    {
        // Save current page range into pdf stream. It can be PNG or HTML or any other supported output format.
        // NOTE It is important to disable field update, otherwise infinite loop may form for certain documents.
        // Field update has already happened or will happen as specified when document was saved originally.
        var pdfStream = new MemoryStream();

        _Document.Save(pdfStream, new PdfSaveOptions { PageSet = new PageSet(pageRange), UpdateFields = false });
        pdfStream.Position = 0;

        _ShowPage(pageRange, pdfStream);
        for (var i = pageRange.From; i < pageRange.From + _BatchSize; i++)
            _ReadyPages.Remove(i);
    }
}

static void Preview(string fileName, int batchSize)
{
    // This technique is only suitable for preview purposes, it is advisable to do
    // a proper full uninterrupted save operation on the document to get correct result.

    var doc = new Document(fileName);
    doc.LayoutOptions.Callback = new PreviewCallback(batchSize, (pageRange, pdfStream) =>
    {
        // Preview code does whatever necessary to render the pdf document, i.e. serve web response.
        // In production it should queue this work on a pool thread to not block layout code.
        // NOTE Each page can be rendered multiple times and each time result can be different. This may also happen out of order.
        // Here tickcount is added to keep all outputs separate and not overwrite.
        using (var pageStream = new FileStream($@"pdf-pages-{pageRange.From + 1}-{pageRange.To + 1} ({Environment.TickCount}).pdf", FileMode.Create))
            pdfStream.CopyTo(pageStream);
    });

    // This builds the page layout and updates fields. The output is discarded, and the format does not matter as long as it is a fixed-page format.
    doc.Save(new MemoryStream(), SaveFormat.Xps);
}
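
Since this thread is about Java, the batching bookkeeping from the C# callback above can also be sketched in plain Java, separated from the Aspose types. This is only an illustration under assumptions: the class name ReadyPageBatcher and its methods are hypothetical, pageReady stands in for the PartReflowFinished branch, buildFinished for the BuildFinished branch, and each returned {from, to} pair (zero-based, inclusive) marks where the callback would save a page range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical, Aspose-free port of the ready-page bookkeeping in PreviewCallback. */
final class ReadyPageBatcher {
    private final TreeSet<Integer> readyPages = new TreeSet<>();
    private final int batchSize;

    ReadyPageBatcher(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be > 0");
        }
        this.batchSize = batchSize;
    }

    /** Mirrors PartReflowFinished: records a page and returns every batch that is now complete. */
    List<int[]> pageReady(int pageIndex) {
        readyPages.add(pageIndex);
        List<int[]> flushed = new ArrayList<>();
        if (readyPages.size() < batchSize) {
            return flushed;
        }
        int from = (readyPages.first() / batchSize) * batchSize;
        int to = readyPages.last();
        for (int start = from; start <= to; start += batchSize) {
            if (containsWholeBatch(start)) {
                flushed.add(flush(start));
            }
        }
        return flushed;
    }

    /** Mirrors BuildFinished: flushes the trailing batch even if it is incomplete. */
    List<int[]> buildFinished() {
        List<int[]> flushed = new ArrayList<>();
        if (!readyPages.isEmpty()) {
            int start = (readyPages.last() / batchSize) * batchSize;
            flushed.add(flush(start));
        }
        return flushed;
    }

    private boolean containsWholeBatch(int start) {
        for (int page = start; page < start + batchSize; page++) {
            if (!readyPages.contains(page)) {
                return false;
            }
        }
        return true;
    }

    // Returns {from, to} for the batch starting at 'start' and forgets its pages.
    private int[] flush(int start) {
        int last = readyPages.lower(start + batchSize); // highest ready page in this batch
        readyPages.subSet(start, start + batchSize).clear();
        return new int[] {start, last};
    }
}
```

Feeding it zero-based pages in the order 0, 1, 2, 3, 4, 5, 6, 1, 2, 3, 5 with a batch size of 5 yields {0, 4} when page 4 arrives and {5, 6} from buildFinished, which is the limitation the comment in the C# code describes: the re-rendered pages 1-3 are never flushed again.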

@dev.raz,
Using Aspose.Slides, our developers have suggested the following approach for you:

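import com.aspose.slides.PdfOptions;
import com.aspose.slides.Presentation;
import com.aspose.slides.SaveFormat;

// `path` is the folder that contains testPres.pptx.
// Open the presentation once, only to read the slide count.
Presentation pres = new Presentation(path + "testPres.pptx");

PdfOptions pdfOptions = new PdfOptions();
pdfOptions.setSaveMetafilesAsPng(false);

int delimiter = 4;   // number of slides per output PDF
int slideIndex = 1;  // slide numbers passed to save() are 1-based
int slidesCount = pres.getSlides().size();
pres.dispose();

while (slidesCount > 0)
{
    // The last batch may contain fewer than `delimiter` slides.
    int[] slides = new int[delimiter];
    if (slidesCount < delimiter)
        slides = new int[slidesCount];

    slidesCount -= delimiter;

    for (int j = 0; j < slides.length; j++)
    {
        slides[j] = slideIndex++;
    }

    // Reload the presentation for each batch and export only the selected slides.
    Presentation innerPres = new Presentation(path + "testPres.pptx");
    innerPres.save(path + "output "+ slides[0] + "-" + slides[slides.length - 1] + ".pdf", slides, SaveFormat.Pdf, pdfOptions);
    innerPres.dispose();
}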
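
The index arithmetic in the loop above can also be pulled into a small helper so it can be checked on its own. This is a sketch only: SlideBatches.split is a hypothetical name, not part of Aspose.Slides, and it assumes the 1-based slide numbers used in the code above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that precomputes the slide-number arrays for each batch. */
final class SlideBatches {
    /** Splits slide numbers 1..slideCount into arrays of at most batchSize entries. */
    static List<int[]> split(int slideCount, int batchSize) {
        if (slideCount < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("slideCount must be >= 0 and batchSize > 0");
        }
        List<int[]> batches = new ArrayList<>();
        for (int first = 1; first <= slideCount; first += batchSize) {
            // The last batch is shorter when slideCount is not a multiple of batchSize.
            int length = Math.min(batchSize, slideCount - first + 1);
            int[] slides = new int[length];
            for (int j = 0; j < length; j++) {
                slides[j] = first + j;
            }
            batches.add(slides);
        }
        return batches;
    }
}
```

For 10 slides and a batch size of 4 this gives [1, 2, 3, 4], [5, 6, 7, 8] and [9, 10]; each array could then be passed as the slides argument of the save call shown above.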

We will be waiting for your feedback.

@dev.raz

For Aspose.Cells, please try the following code:

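import com.aspose.cells.ImageOrPrintOptions;
import com.aspose.cells.PdfSaveOptions;
import com.aspose.cells.Workbook;
import com.aspose.cells.WorkbookPrintingPreview;

Workbook wb = new Workbook("BookTest.xlsx");

// Get the total page count of the rendered workbook.
WorkbookPrintingPreview workbookPrintingPreview = new WorkbookPrintingPreview(wb, new ImageOrPrintOptions());
int totalPageCount = workbookPrintingPreview.getEvaluatedPageCount();

int stepSize = 10;  // pages per output PDF
int pageIndex = 0;  // zero-based index of the first page in the current batch
while (pageIndex < totalPageCount)
{
    // Save only `stepSize` pages, starting at `pageIndex`.
    PdfSaveOptions pdfSaveOptions = new PdfSaveOptions();
    pdfSaveOptions.setPageIndex(pageIndex);
    pdfSaveOptions.setPageCount(stepSize);

    // The file name uses 1-based page numbers, e.g. Output_1-10.pdf.
    wb.save("Output_" + (pageIndex+1) + "-" + Math.min(totalPageCount, pageIndex+stepSize) + ".pdf", pdfSaveOptions);

    pageIndex += stepSize;
}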

The issues you have found earlier (filed as WORDSNET-24743) have been fixed in this Aspose.Words for Java 23.2 update.