Word doc with thousands of pages takes a long time and GB of memory to convert to PDF with PdfSaveOptions.PageCount set to 1

We are using Aspose.Words 11.6.1.

When converting documents containing thousands of pages to PDF (I have attached an example), the conversion still uses gigabytes of memory and takes a long time, even though we limit the output to 1 page using PdfSaveOptions.PageCount. I don’t know exactly how long it takes, because I killed the process after 30 minutes.

The code is something like this:

var doc = new Document(sourceFilePath);
Aspose.Words.Saving.PdfSaveOptions pdfSaveOptions = new Aspose.Words.Saving.PdfSaveOptions();
pdfSaveOptions.PageCount = 1;
doc.Save(destinationFilePath, pdfSaveOptions);

A related issue:

You can’t set PdfSaveOptions.PageCount higher than Document.PageCount, or you get an exception when you call Save(…).

However, for the example document attached above, just reading the value of Document.PageCount also uses GBs of memory and takes a long time.

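int maxPages = 100;
var doc = new Document(sourceFilePath);
var pdfSaveOptions = new Aspose.Words.Saving.PdfSaveOptions();

// Only cap the output when the document has more pages than the limit.
// Reading doc.PageCount is the step that is slow and memory-hungry here.
if (maxPages < doc.PageCount)
{
    pdfSaveOptions.PageCount = maxPages;
}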

Hi Reuben,

Thanks for your inquiry, and sorry for the delayed response. We’re looking into this scenario and will get back to you soon.

Best Regards,

Hi,

Thanks for your patience.

Using the latest version of Aspose.Words, i.e. 11.6.0, I was unable even to render your document to PDF because an OutOfMemoryException was thrown on my side. I have logged this performance issue in our bug tracking system. The issue ID is WORDSNET-6833. Your request has also been linked to this issue, and you will be notified as soon as it is resolved. Sorry for the inconvenience.

P.S.: Please note that Aspose.Words usually needs several times more memory than the document size to build a model of the document in memory. For example, if your document’s size is 1 MB, Aspose.Words needs 10-20 MB of RAM to build its DOM in memory. The multiplier depends on the format, because some formats are more compact than others. For example, DOCX is more compact than DOC and RTF, and DOC is more compact than RTF.
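
As a rough illustration of the rule of thumb above, here is a minimal sketch that turns a file size into the 10x-20x range quoted in this post. The class and method names (DomMemoryEstimate, Estimate) are made up for this example and are not part of Aspose.Words. The estimate covers only the in-memory document model described above and says nothing about the memory needed for layout or rendering.

```csharp
using System;
using System.IO;

static class DomMemoryEstimate
{
    // Rule-of-thumb multipliers quoted in the post (10x to 20x of file size).
    // These are approximations, not values read from the library.
    const long LowMultiplier = 10;
    const long HighMultiplier = 20;

    // Returns the estimated (low, high) RAM range in bytes for a file of the given size.
    public static (long Low, long High) Estimate(long fileSizeBytes)
    {
        if (fileSizeBytes < 0)
            throw new ArgumentOutOfRangeException(nameof(fileSizeBytes));
        return (fileSizeBytes * LowMultiplier, fileSizeBytes * HighMultiplier);
    }

    static void Main(string[] args)
    {
        // Use the file given on the command line, or fall back to a 1 MB example size.
        long size = args.Length > 0 ? new FileInfo(args[0]).Length : 1024L * 1024L;
        var (low, high) = Estimate(size);

        const double Mb = 1024.0 * 1024.0;
        Console.WriteLine($"File size: {size / Mb:F1} MB");
        Console.WriteLine($"Estimated DOM memory: {low / Mb:F0}-{high / Mb:F0} MB");
    }
}
```

Because the multiplier depends on the format, treat the result as an order-of-magnitude guide only.
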
Best Regards,

The issues you have found earlier (filed as WORDSNET-6833) have been fixed in this Aspose.Words for .NET 17.2.0 update and this Aspose.Words for Java 17.2.0 update.

Hi there,

We have introduced an option (SaveOptions.MemoryOptimization) to reduce memory consumption in scenarios like this. Setting it to true reduces the document’s memory footprint but adds processing time. The optimization is applied only during the save operation.

Please use the following code example.

Document doc = new Document(MyDir + "in.docx");

PdfSaveOptions options = new PdfSaveOptions();
options.MemoryOptimization = true;

doc.Save(MyDir + "Out v17.2.0.pdf", options);
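
For the scenario in the original question, the new option could be combined with the page limit along these lines. This is only a sketch: the file paths are placeholders, it assumes PdfSaveOptions.PageCount is still available in the version you are using, and nothing in this thread confirms how much the combination helps when only the first page is rendered.

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

class FirstPageToPdf
{
    static void Main()
    {
        // Placeholder paths, for illustration only.
        string sourceFilePath = "in.docx";
        string destinationFilePath = "out.pdf";

        Document doc = new Document(sourceFilePath);

        PdfSaveOptions options = new PdfSaveOptions();
        // Render only the first page, as in the original question.
        options.PageCount = 1;
        // Trade extra processing time for a lower memory footprint (17.2.0 or later).
        options.MemoryOptimization = true;

        doc.Save(destinationFilePath, options);
    }
}
```

Since the optimization is applied only during the save operation, it should not be expected to change the cost of reading Document.PageCount beforehand.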