Very high RAM usage

rd1218 · August 24, 2023, 4:14am

I recently started trying Aspose.PDF (.NET Framework 4.8) and noticed very high RAM consumption. Without Aspose.PDF running, the system uses about 20% of RAM. When Aspose.PDF starts processing (most noticeable with larger files, around 20 MB), it consumes all available RAM, with usage reaching 97-99%, and the server stops responding reliably.

I couldn’t find a minimum RAM requirement for this product.

Is there a way to limit Aspose.PDF RAM usage?

image.png (21.0 KB)

asad.ali · August 24, 2023, 1:08pm

@rd1218

Memory usage depends on the operations you perform with the API and on the complexity of the document. Could you please share a sample document and a sample code snippet so that we can try to reproduce the issue in our environment and address it accordingly?

rd1218 · August 24, 2023, 2:38pm

@asad.ali
I think there should be a better approach here, such as a limit on RAM usage or even on disk cache usage. Ideally, it would be a parameter passed with each request or a global setting.

While Aspose.PDF does seem to generate good output XLS files, the lack of any control over resource usage is dangerous, because it will eventually cause server time-outs.

Regarding your request for a sample file, I don’t think this is a good approach. It’s well known that poorly generated PDF files exist, so a limit on hardware resource usage should always be a concern.
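One way to enforce a resource limit from outside the library, sketched here under stated assumptions, is to run each conversion in a separate worker process and stop it when it exceeds a memory or time budget. Nothing below is an Aspose.PDF feature: `IsolatedRunner` and `RunWithLimits` are hypothetical names, the worker executable is assumed to exist and to perform the conversion, and the polling makes this a soft limit (memory can spike between checks). A hard cap on Windows would need a Job Object instead.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;

// Hypothetical helper: runs a worker process under a soft memory and time budget.
public static class IsolatedRunner
{
    // Returns true only if the process exits on its own with exit code 0.
    // Returns false if it was killed for exceeding a limit or exited with an error.
    public static bool RunWithLimits(ProcessStartInfo startInfo, long maxWorkingSetBytes, TimeSpan timeout)
    {
        startInfo.UseShellExecute = false;
        using (var process = Process.Start(startInfo))
        {
            var clock = Stopwatch.StartNew();

            // Poll every 250 ms until the process exits.
            while (!process.WaitForExit(250))
            {
                bool overMemory = false;
                try
                {
                    process.Refresh(); // discard cached counters
                    overMemory = process.WorkingSet64 > maxWorkingSetBytes;
                }
                catch (InvalidOperationException)
                {
                    continue; // process exited between the wait and the read
                }

                if (overMemory || clock.Elapsed > timeout)
                {
                    try { process.Kill(); }
                    catch (InvalidOperationException) { } // already exited
                    catch (Win32Exception) { }            // already terminating
                    process.WaitForExit();
                    return false;
                }
            }
            return process.ExitCode == 0;
        }
    }
}
```

A web request could then call something like `IsolatedRunner.RunWithLimits(new ProcessStartInfo("PdfToXlsxWorker.exe", arguments), 2L * 1024 * 1024 * 1024, TimeSpan.FromMinutes(2))`, where the worker name and both limits are placeholders, and return an error to the client when the call returns false instead of letting the whole server run out of memory.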

asad.ali · August 24, 2023, 10:04pm

@rd1218

We requested the sample file because PDF documents can have a wide variety of structures. It is quite possible that our own test documents do not have a similar structure and complexity, so we prefer to investigate with the same PDF documents in which the issue was observed. If possible, please share a sample PDF document along with the details of the memory consumption you noticed at your end. We will log an investigation ticket in our issue tracking system and work on it to improve API performance.

rd1218 · August 24, 2023, 10:59pm

Ok, I truly understand the need to check every input for the sake of improving the code.

But in this specific case I stand by my view that a RAM limit should exist; otherwise we would need to apply artificial limitations (like a file size limit).

Anyway, here is a PDF that had the problem:
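The file size limit mentioned above can be sketched as a small pre-check, combined with .NET's `System.Runtime.MemoryFailPoint`, which refuses to start work when the runtime does not expect the requested memory to be available. This is only an illustration: `ConversionGuard`, its method names, and both default limits are assumptions rather than anything provided by Aspose.PDF, and neither check caps what the library uses once a conversion is running.

```csharp
using System;
using System.Runtime;

// Hypothetical helper; not part of Aspose.PDF.
public static class ConversionGuard
{
    // Assumed limits; tune them for the actual server.
    public const long MaxUploadBytes = 10L * 1024 * 1024; // reject uploads above 10 MB
    public const int EstimatedPeakMegabytes = 2048;       // rough guess at peak memory per conversion

    // True when the upload is non-empty and no larger than maxBytes.
    public static bool IsWithinSizeLimit(long contentLength, long maxBytes = MaxUploadBytes)
    {
        return contentLength > 0 && contentLength <= maxBytes;
    }

    // Runs 'work' only if the runtime expects 'megabytes' of memory to be available.
    // Returns false, without calling 'work', when the memory gate cannot be opened.
    public static bool TryRunWithMemoryGate(int megabytes, Action work)
    {
        if (work == null) throw new ArgumentNullException(nameof(work));

        MemoryFailPoint gate;
        try
        {
            gate = new MemoryFailPoint(megabytes);
        }
        catch (InsufficientMemoryException)
        {
            return false; // not enough memory right now; caller can reject or queue the request
        }

        using (gate)
        {
            work();
        }
        return true;
    }
}
```

In the upload handler, the checks would run before the PDF is loaded, for example `ConversionGuard.IsWithinSizeLimit(inputFile.ContentLength)` followed by `ConversionGuard.TryRunWithMemoryGate(ConversionGuard.EstimatedPeakMegabytes, () => { /* load and save the document here */ })`, returning an error response when either call returns false.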

File “input_sample_1” caused an internal server error (probably due to RAM shortage) on a server with 8 GB of DDR4 RAM;
File “input_sample_2” was obtained by simply opening “input_sample_1” in Adobe and using “print to PDF”; it then converted successfully, but with very high RAM usage.

Files (2 weeks available):

asad.ali · August 25, 2023, 12:22am

@rd1218

We are checking it and will get back to you shortly.

rd1218 · August 25, 2023, 12:50am

@asad.ali
Thanks. Also, please consider the code below. I’m using Aspose.PDF for .NET 23.8.0.

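// Requires: using System.Configuration; using System.IO; using Aspose.Pdf;
var inputFile = Request.Files[i];

// Load the uploaded PDF; the using block disposes the Document once the save completes
using (var document = new Document(inputFile.InputStream))
{
    // Build a timestamped output name from the upload's base name
    var baseName = Path.GetFileNameWithoutExtension(inputFile.FileName);
    var filename = baseName + "_" + dataAtual.ToString("yyyy.MM.dd-HH.mm.ss") + ".xlsx";

    var dataSave = new ExcelSaveOptions { MinimizeTheNumberOfWorksheets = true };

    // Path.Combine works whether or not the configured folder ends with a separator
    var filepath = Path.Combine(ConfigurationManager.AppSettings["outFolder"], filename);

    // Save the document as an Excel workbook
    document.Save(filepath, dataSave);
}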

asad.ali · August 25, 2023, 12:48pm

@rd1218

Sure, we will test using this code snippet and get back to you with our feedback.

asad.ali · September 7, 2023, 4:03pm

@rd1218

It looks like the shared files were removed before the two weeks had passed. Would you kindly upload them again so that we can proceed with our investigation? We are sorry for the trouble.

rd1218 · September 7, 2023, 5:20pm

@asad.ali did you download them earlier?
The link reached its 14 days and the files were removed.
I’m afraid I don’t have those files anymore.

asad.ali · September 8, 2023, 12:23am

@rd1218

Luckily, we were able to locate those downloaded files in our system. We are checking it and will get back to you shortly.

rd1218 · September 8, 2023, 12:44am

@asad.ali
Ok! Thank you.

rd1218 · November 19, 2023, 6:34pm

Hello.
Regarding this subject, any updates?

asad.ali · November 19, 2023, 10:38pm

@rd1218

We were able to observe high RAM usage in our environment with older versions of the API. However, we have just released version 23.11 of the API with further improvements. We will check the scenario with this version as well and share our feedback with you.

rd1218 · November 19, 2023, 11:45pm

Ok. Please get back to me on this.
I went on vacation and then had some other issues to deal with; now I’m returning to this matter.
If the issue persists, please let me know your thoughts on how to address it.

asad.ali · November 20, 2023, 1:20pm

@rd1218

Sure. We will try to complete the initial investigation soon and share our feedback with you.

asad.ali · December 13, 2023, 8:46pm

@rd1218

Thank you for your patience and for bearing with us. We have performed an initial investigation in our environment using both the 23.11 and 23.11.1 (a hotfix released later) versions of the API. We were able to observe the high memory consumption in our environment.

Environment

Windows 11 64-bit Pro
16 GB RAM
Core i7

We also tried optimizing the PDF document (input_sample1.pdf) with the Document.Optimize() method to check whether enabling Fast Web View would help, but it did not improve the memory consumption or performance of the API. An investigation ticket, PDFNET-56150, has been logged in our issue tracking system to analyze this case in further detail.

We will certainly work on this issue and improve API performance for this type of PDF document. Please note that we also tested the case with other large PDFs, but the API did not behave the same way with them. It appears that the issue may be related to the specific type of these PDFs. Nevertheless, we will inform you as soon as the ticket is resolved. We apologize for the inconvenience.

rd1218 · April 9, 2024, 8:21pm

@asad.ali
Hello,
Should I expect any update on this issue?
Or should this be closed?

asad.ali · April 9, 2024, 10:57pm

@rd1218

Regretfully, the ticket has not yet been resolved because of other pending issues that were logged before it. However, we have recorded your concerns and will inform you as soon as we make significant progress. Please allow us some more time.

We are sorry for the inconvenience.

rd1218 · November 6, 2024, 3:52pm

@asad.ali
A long time has passed.
Should I expect any update on this issue, or should this be closed?