Saving PDF Doc to MemoryStream takes very long (approx. 15 minutes) with large PDF files

Saving an Aspose.Pdf document to a MemoryStream takes very long (approx. 15 minutes) with large PDF files.
We are currently testing with files of approx. 300 MB.

The associated code is below:

Dim pdfDocument As New Aspose.Pdf.Document(_pdfDocumentFilePath)
Using pptStream As New MemoryStream()
    pdfDocument.Save(pptStream)
    _dataProvider.PdfFileContent = pptStream
End Using

Environment: .NET Framework 4.7.2, Aspose.Pdf V.24.3.0.0.
(With Aspose.Pdf V.7.0.0.0, the same code causes an out-of-memory issue.)

Thanks in advance for your help.

@clintrak_it

When saving large PDF files to a MemoryStream, performance issues can arise, especially with files around 300MB in size. Here are some suggestions to help improve the performance of your code:

  1. Optimize Memory Usage: Ensure that your application has sufficient memory allocated. Large PDF files can consume a significant amount of memory, and if your application runs out of memory, it may lead to performance degradation or exceptions.

  2. Use Buffered Streams: Instead of writing directly to a MemoryStream, consider wrapping it in a BufferedStream. Buffering mainly reduces the number of writes to file or network streams, so the benefit for a purely in-memory stream may be limited.

  3. Check for Updates: You are using Aspose.Pdf version 24.3.0.0. Check if there are any newer versions available, as performance improvements and bug fixes are often included in updates.

  4. Profile Your Code: Use profiling tools to identify bottlenecks in your code. This can help you understand where the time is being spent during the save operation.

  5. Consider File Size: If possible, try to work with smaller PDF files or split large files into smaller chunks before processing them. This can significantly reduce the time taken to save to a MemoryStream.

Here is a modified version of your code that incorporates a buffered stream:

Dim pdfDocument As New Aspose.Pdf.Document(_pdfDocumentFilePath)
Using pptStream As New BufferedStream(New MemoryStream())
    pdfDocument.Save(pptStream)
    _dataProvider.PdfFileContent = pptStream
End Using

A BufferedStream reduces the number of write operations that reach the underlying stream. This mostly pays off for file or network streams; with a MemoryStream underneath, any improvement is likely to be small.

If the problem persists, consider reaching out to Aspose support for more tailored assistance regarding performance with large PDF files.
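
The profiling suggestion in point 4 can be tried without any extra tooling. The following is only a minimal sketch, not an official Aspose sample: it times the load and the save separately and prints the managed and private memory of the process, so it becomes visible which step is slow and how much memory it costs. The module name and the command-line argument for the input path are assumptions made for illustration.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO

Module SaveProfilingSketch
    Sub Main(args As String())
        ' Hypothetical input: path of the large PDF, passed as the first argument.
        Dim pdfPath As String = args(0)

        ' A 32-bit process has a much smaller address space, so report it.
        Console.WriteLine("64-bit process: {0}", Environment.Is64BitProcess)

        ' Time the load step on its own.
        Dim loadTimer As Stopwatch = Stopwatch.StartNew()
        Using pdfDocument As New Aspose.Pdf.Document(pdfPath)
            loadTimer.Stop()
            Console.WriteLine("Load: {0} ms", loadTimer.ElapsedMilliseconds)

            Dim managedBefore As Long = GC.GetTotalMemory(False)

            ' Time the save step on its own.
            Dim saveTimer As Stopwatch = Stopwatch.StartNew()
            Using pdfStream As New MemoryStream()
                pdfDocument.Save(pdfStream)
                saveTimer.Stop()

                Dim managedAfter As Long = GC.GetTotalMemory(False)
                Dim privateBytes As Long = Process.GetCurrentProcess().PrivateMemorySize64

                Console.WriteLine("Save: {0} ms", saveTimer.ElapsedMilliseconds)
                Console.WriteLine("Stream length: {0} bytes", pdfStream.Length)
                Console.WriteLine("Managed heap growth: {0} bytes", managedAfter - managedBefore)
                Console.WriteLine("Private bytes: {0}", privateBytes)
            End Using
        End Using
    End Sub
End Module
```

These numbers are coarse compared with a real profiler, but they are usually enough to tell whether the time is spent in loading or in saving before escalating the issue.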


Hi Aspose team,
Unfortunately, the advice above has not worked.
What I have noticed is that saving a 300 MB PDF document to a memory stream uses over 6 GB of application memory.
Why does saving to the memory stream take up so much memory?
Is there a memory leak here?
I tested with the latest available Aspose.PDF version, 24.10.0.0.

Thanks in advance for your valued help.

@clintrak_it

Could you please provide a sample PDF document for our reference so that we can test the scenario in our environment to observe the memory consumption and address it accordingly? You can upload it to Google Drive or Dropbox and share the link with us.

Thank you for the quick response.
For copyright reasons, I cannot send you a file at the moment.
We are trying to find out how we can limit the file size to around 85MB.
Unfortunately, our application runs on .NET Framework 4.7.2 and targets the x86 CPU.
We cannot change this easily because our application has many dependencies on x86 DLLs.
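
One way to enforce such a limit is to check the file size before the document is loaded at all. This is only a rough sketch under our own assumptions: the 85 MB threshold is the figure mentioned above, and the module and function names are hypothetical.

```vbnet
Imports System
Imports System.IO

Module FileSizeGuardSketch
    ' Assumed limit, taken from the ~85 MB figure mentioned above.
    Private Const MaxPdfBytes As Long = 85L * 1024L * 1024L

    ' Returns True when the file on disk is small enough to process.
    Function IsWithinLimit(pdfPath As String) As Boolean
        Return New FileInfo(pdfPath).Length <= MaxPdfBytes
    End Function

    Sub Main(args As String())
        ' Hypothetical input: path of the PDF, passed as the first argument.
        Dim pdfPath As String = args(0)

        If IsWithinLimit(pdfPath) Then
            Console.WriteLine("File is within the limit and can be loaded.")
        Else
            Console.WriteLine("File exceeds the limit; skipping the in-memory save.")
        End If
    End Sub
End Module
```

The check only looks at the size on disk, so it does not predict how much memory the document needs once it is loaded.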

@clintrak_it

Please take your time to gather the file and the other information we requested, and we will then assist you further. You can also share the file in a private message, so that it stays between you and Aspose staff only. We use the data for testing purposes only and erase it from our systems once the issue is resolved. To send a private message, click on the username and press the blue Message button.