We have been using Aspose.Words/Aspose.PDF for Java (licensed Aspose Total). We have recently observed that the Aspose libraries might be leaving open file descriptors in our Java process, and these FDs remain open, showing as "(deleted)" under /proc/<PID>/fd, even after the file itself has been deleted.
The only way to close/terminate such open FDs is to restart the Java process.
Following is a code example of our implementation of creating a PDF from Base64 content.
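A minimal sketch of such an implementation, assuming the Base64 string carries the PDF bytes and is loaded via com.aspose.pdf.Document; names like createPdfFromBase64 are illustrative, not the exact production code:

import com.aspose.pdf.Document;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Base64;

public class PdfFromBase64 {
    public static void createPdfFromBase64(String base64Content, String outputPath) throws IOException {
        byte[] pdfBytes = Base64.getDecoder().decode(base64Content);
        Document document = null;
        try (ByteArrayInputStream input = new ByteArrayInputStream(pdfBytes)) {
            document = new Document(input); // load the decoded bytes
            document.save(outputPath);      // write the PDF to disk
        } finally {
            if (document != null) {
                document.close();           // release Aspose-held streams
            }
        }
    }
}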
Unfortunately, we are unable to execute your code. Please create a sample Java application (source code without compilation errors) that helps us to reproduce your problem on our end and attach it here for testing.
Please also share the complete steps for how you are checking these values.
I will provide a simple Java application for your debugging, with simple PDF creation via Base64 content. Meanwhile:
We are observing the open descriptors via ls -ltr /proc/PID/fd for the JVM process ID.
Normally, FDs here open and close, but documents created through Aspose have their descriptors visible here until we restart the JVM. Ideally, they should not appear under /proc/PID/fd after the program has completed and close() has been called within the code, as done and shared below.
Please download the sample project with the implementation of the code shared before. The code leaves open file descriptors, as observed on our Linux Red Hat environment.
We have logged an investigation ticket as PDFJAVA-41456 in our issue tracking system. You will be notified via this forum thread once this issue is resolved.
Thanks for your response. We would be thankful for an earlier response, as we currently need to restart the production application server on a daily basis, which has a bad impact on the business and also requires involving our infra teams.
Unfortunately, there is no update available on this issue at the moment. We will inform you via this forum thread once there is any news available on it.
Please expedite a fix for this as soon as possible. It's been nearly 3 months since the issue was reported, and open descriptors really choke the system once they cross a certain threshold against the ulimit parameter.
This is to inform you that the issue you are facing is actually not a bug in Aspose.PDF. So, we have closed this issue (PDFJAVA-41456) as 'Not a Bug'.
We investigated your code in different environments (macOS, Ubuntu and Red Hat) and found some differences.
On Red Hat (Red Hat Enterprise Linux release 8.4), open file descriptors do remain; in our case they were only font files. But this does not point to the Aspose.PDF library, because the descriptors are left open by Java itself. Aspose.PDF only invokes the method for getting the list of available fonts and cannot close these font files.
For example, you can try running the following code snippet to get the open FDs:
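A minimal sketch of such a snippet, assuming the font-listing call mentioned above is the standard JDK GraphicsEnvironment.getAvailableFontFamilyNames() and that open descriptors are read from /proc/self/fd (Linux only; class and variable names are illustrative):

import java.awt.GraphicsEnvironment;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class OpenFdDemo {
    public static void main(String[] args) throws Exception {
        // Enumerate the available font families; the JDK opens/maps the
        // font files it scans and may keep those descriptors alive.
        String[] fonts = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        System.out.println("Fonts found: " + fonts.length);

        // List this process's open descriptors, as seen under /proc/self/fd.
        File[] fds = new File("/proc/self/fd").listFiles();
        if (fds != null) {
            for (File fd : fds) {
                System.out.println(fd.getName() + " -> "
                        + Files.readSymbolicLink(Paths.get(fd.getAbsolutePath())));
            }
        }
    }
}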
For long we believed that closing streams was the issue, but as you suggested, it is something different altogether.
Please specify a workaround or a fix to remedy this FD issue, whether it lies in Aspose Java or in the fonts. For the last 2~3 months we have been regularly restarting the JVM to keep the FD count in check, but this too is getting tiresome for the teams as the load on the PDF generation service using Aspose increases.
We are waiting for a prompt response to resolve this long-running thread.
We investigated this issue in detail and noticed that it is not related to Aspose.PDF. Could you please share the following details, along with the complete steps you are following, so we can reproduce the same issue at our end?
Java version
Linux Red Hat version
Screenshots with names of opened FDs
Any other details that would help us reproduce the issue in our environment
We have observed similar behavior in our application … our differences are:
we use Aspose in an AWS Lambda (serverless) architecture, so there is no server we control.
the JVM is launched and terminated by AWS.
We are using Aspose.HTML, converting HTML to an image (JPG).
It happens under heavy load (rapidly arriving requests).
Our application responds to events on a messaging topic and creates a JPEG.
For performance, AWS Lambda will re-use the Lambda function instance (called a warm start) as opposed to starting afresh (a cold start).
So when there are a lot of orders in rapid sequence, a single instance is re-used to serve multiple requests.
In each request we do the same thing:
Create an Aspose HTMLDocument object, followed by Aspose's Converter.convertHTML()
… which presumably leaks some file handles, which accumulate to starve the OS of its resources.
So a time comes when AWS Lambda terminates the warm instance and starts a fresh one (a cold start), and operation proceeds smoothly for some more time, until the above sequence repeats.
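To illustrate why the leak accumulates, a minimal sketch (not our production handler) assuming the standard AWS RequestHandler interface; the String event type and the log message are illustrative. A warm start re-enters handleRequest() in the same JVM, so descriptors leaked by one invocation are still open in the next:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import java.io.File;

public class ImageHandler implements RequestHandler<String, String> {
    @Override
    public String handleRequest(String event, Context context) {
        // On a warm start this method runs again in the SAME JVM, so any
        // descriptors leaked by earlier invocations are still counted here.
        File[] fds = new File("/proc/self/fd").listFiles();
        int open = (fds == null) ? 0 : fds.length;
        context.getLogger().log("Open FDs at invocation start: " + open);

        // ... Aspose HTML-to-JPEG conversion would run here ...
        return "done";
    }
}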
Would you please confirm if you are using the latest version of the API? Also, please share some more steps in order to replicate the issue in our environment so that we can try to replicate it and address it accordingly.
Hi,
I upgraded to
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-html</artifactId>
    <version>22.9-jdk1.8</version>
</dependency>
And could reproduce the issue.
However, I feel it would be difficult for you to reproduce the issue at your end, as I feel it is very much due to the resources available in the AWS Lambda environment.
Code-wise, my Lambda function receives a Kafka trigger giving the location of an S3 bucket which contains some artifacts. Subsequently the artifacts (XML etc.) are parsed and an HTML is developed as a String, and then, using:
public String createInfotechImage(PictureData picData, String html) throws IOException {
    // Assumes imports: com.aspose.html.Url and com.aspose.html.drawing.Resolution.
    // "html" is the HTML developed earlier as a String (added as a parameter so the method compiles).
    LOG.info("The template has been read and tokens replaced.");

    // Write the generated HTML to a temporary file; try-with-resources
    // guarantees the stream is closed even if the write fails.
    File htmlFile = File.createTempFile("infotech", ".html");
    try (OutputStream os = Files.newOutputStream(htmlFile.toPath())) {
        os.write(html.getBytes());
    }

    String pathToHtml = htmlFile.getAbsolutePath();
    Url urlToPage = new Url(pathToHtml);
    String imageFile = File.createTempFile("myPictur", ".jpg").getAbsolutePath();

    // Initialize an HTML document from the html file
    com.aspose.html.HTMLDocument document = new com.aspose.html.HTMLDocument(urlToPage);
    try {
        com.aspose.html.saving.ImageSaveOptions options =
                new com.aspose.html.saving.ImageSaveOptions(com.aspose.html.rendering.image.ImageFormat.Jpeg);
        com.aspose.html.rendering.PageSetup pageSetup = new com.aspose.html.rendering.PageSetup();
        com.aspose.html.drawing.Page anyPage = new com.aspose.html.drawing.Page();
        options.setHorizontalResolution(Resolution.fromDotsPerInch(300));
        options.setVerticalResolution(Resolution.fromDotsPerInch(300));
        options.setBackgroundColor(com.aspose.html.drawing.Color.fromName(picData.getBkColor()));
        anyPage.setSize(new com.aspose.html.drawing.Size(
                com.aspose.html.drawing.Length.fromPixels(picData.getWidth()),
                com.aspose.html.drawing.Length.fromPixels(picData.getHeight())));
        pageSetup.setAnyPage(anyPage);
        options.setPageSetup(pageSetup);

        // Convert HTML to JPEG
        com.aspose.html.converters.Converter.convertHTML(document, options, imageFile);
    } finally {
        document.dispose(); // release Aspose-held resources
    }

    LOG.debug("Image File {}", imageFile);
    String im = new File(imageFile).getPath();
    LOG.info("The Infotech Image File has been created. {}", im);
    htmlFile.delete();
    return im; // return the path of the jpeg file, created on EFS
}
This code works perfectly when we have a small number of payloads arriving on the Kafka topic … but as the frequency of messages arriving on Kafka increases to something like hundreds or thousands, multiple instances of my Lambda function are invoked, and at some point we start getting a "Too many open files" error in the Log Stream.
So to solve the issue, we manually update some configuration parameter of the Lambda, which forces a cold start of the function, and it starts working OK for a while, until the frequency of the payloads increases again.
So I think we still have some kind of file descriptor/handle leak issue.
It probably would not happen (or would go unnoticed) if I moved to a Spring Boot app deployed on an EC2 instance.