Aspose PDF/Word Java leaving open descriptors on latest respective versions

Hi,

We have been using Aspose.Words and Aspose.PDF for Java (licensed Aspose.Total). We have recently observed that the Aspose libraries may be leaving open file descriptors in our Java process; these FDs remain open, marked "(deleted)" under /proc/PID/fd, even after the file itself has been deleted.

The only way to close such open FDs is to restart the Java process.
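For reference, the "(deleted)" state itself can be demonstrated with plain JDK I/O, with no Aspose involved. This is only a minimal, Linux-oriented sketch of the mechanism described above:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

public class DeletedFdDemo {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("demo", ".pdf");
        FileInputStream in = new FileInputStream(f);
        Files.delete(f.toPath());  // the file is gone from the directory...
        // ...but the descriptor is still open; on Linux, `ls -l /proc/PID/fd`
        // now shows this entry suffixed with "(deleted)".
        System.out.println("stream still readable: " + (in.read() == -1));
        in.close();                // only now is the descriptor released
    }
}
```

The point is that deleting a file does not release descriptors still referencing it; only closing the stream does.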

The following is a code example of our implementation, which creates a PDF from Base64 content.

AsposeFileDto createPDFAspose(String content) throws DocumentException, IOException {
    addAsposeLicense();
    AsposeFileDto output = new AsposeFileDto();
    String dest = "resources/" + UUID.randomUUID().toString() + ".pdf";

    byte[] byteValue = java.util.Base64.getDecoder().decode(content);

    ByteArrayInputStream pdfFile = new ByteArrayInputStream(byteValue);
    com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document(pdfFile);
    pdfDocument.save(dest);
    pdfDocument.close();

    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    com.aspose.pdf.Document pdfDocument1 = new com.aspose.pdf.Document(dest);
    pdfDocument1.save(baos);
    pdfDocument1.close();

    output.setBaos(baos);
    return output;
}

Note that both PDF document variables were closed.

Sample document attached.
8 Years O&M Amendment Agreement [Executed Version 26-05-2015] compressed_compressed - Copy (7).pdf (357.1 KB)

A prompt response with insight and a possible solution would be highly appreciated.

@Waleed_Khalid

Unfortunately, we are unable to execute your code. Please create a sample Java application (source code without compilation errors) that helps us to reproduce your problem on our end and attach it here for testing.

Please also share the complete steps for how you are checking these values.

I will provide a simple Java application for your debugging, with simple PDF creation via Base64 content. In the meantime:

We are observing the open descriptors via ls -ltr in the /proc/PID/fd directory for the JVM process ID.

Normally, FDs here open and close, but documents created through Aspose have their descriptors visible here until we restart the JVM. Ideally they should not appear under /proc/PID/fd after the program has completed and close() has been called in the code, as shared above.
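The same check can also be done from inside the JVM. A minimal Linux-only sketch (the class name is illustrative) that counts entries under /proc/self/fd:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class FdCheck {
    // Counts this JVM's open file descriptors by listing /proc/self/fd.
    // Returns -1 on systems without a Linux-style /proc filesystem.
    static long countOpenFds() throws IOException {
        Path fdDir = Paths.get("/proc/self/fd");
        if (!Files.isDirectory(fdDir)) {
            return -1;
        }
        try (Stream<Path> entries = Files.list(fdDir)) {
            return entries.count(); // includes the directory stream itself
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("open descriptors: " + countOpenFds());
    }
}
```

Logging this count before and after each conversion makes it easy to see whether descriptors are accumulating.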


Thanks.

Please download the sample project with the implementation of the code shared before. The code leaves open file descriptors, as observed in our Red Hat Linux environment.

Regards,
Waleed Khalid.

We need your response soon, please, to fix the problem on our production systems; because of this issue, we need to restart the application on a daily basis.

@Waleed_Khalid

We have logged an investigation ticket as PDFJAVA-41456 in our issue tracking system. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

A post was split to a new topic: Aspose Word Java leaving open descriptors on latest respective versions

Tahir,

Thanks for your response. We would be thankful for an earlier response, as we need to restart the production application server on a daily basis, which has a bad impact on the business and also requires involving our infra teams.

Regards

@asad50089

We will investigate the issue and share an update on it with you via this forum thread.

Tahir, do we have any update on this issue?

@Waleed_Khalid

Your issue is pending analysis at the moment. Once we complete the analysis of your issue, we will be able to provide you with an estimate.

Any positive news, Tahir?

@Waleed_Khalid

Unfortunately, there is no update available on this issue at the moment. We will inform you via this forum thread once there is any news available on it.

@tahir.manzoor

Please expedite a fix for this as soon as possible. It's been nearly 3 months since the issue was reported, and open descriptors really choke the system once they cross a certain threshold against the ulimit parameter.

We are eagerly waiting for a fix.

Regards
Waleed Khalid

@Waleed_Khalid

It is to inform you that the issue which you are facing is actually not a bug in Aspose.PDF. So, we have closed this issue (PDFJAVA-41456) as ‘Not a Bug’.

We investigated your code in different environments (macOS, Ubuntu and Red Hat) and found some differences.

On the Red Hat OS (Red Hat Enterprise Linux release 8.4), some file descriptors do remain open; in our case these were only fonts. But this is not caused by the Aspose.PDF library, because the descriptors are left open by Java itself. Aspose.PDF only invokes the method for getting the list of available fonts and cannot close these font files.

For example, you can run the following code snippet to see the open FDs:

import java.awt.GraphicsEnvironment;
import java.util.Arrays;

GraphicsEnvironment localGraphicsEnvironment = GraphicsEnvironment.getLocalGraphicsEnvironment();
java.awt.Font[] fonts = localGraphicsEnvironment.getAllFonts();
System.out.println(Arrays.toString(fonts));
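If the descriptors do belong to Java's font enumeration, one possible mitigation (not a fix, and the class name below is hypothetical) is to trigger that enumeration once at JVM startup, so the font files are opened a single time and the FD count stabilizes instead of appearing during request handling:

```java
import java.awt.GraphicsEnvironment;

public class FontWarmup {
    // Hypothetical warm-up: force Java's font enumeration once at startup
    // so font-file descriptors are opened a single time, up front.
    public static int warmUp() {
        System.setProperty("java.awt.headless", "true"); // no display needed
        GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();
        return ge.getAllFonts().length;
    }

    public static void main(String[] args) {
        System.out.println("Fonts enumerated: " + warmUp());
    }
}
```

This does not reduce the descriptor count, but it makes it constant from the start, which is easier to budget against ulimit.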

@tahir.manzoor,

Thanks for the update. We too are using a font in our code for setting the template, something like:

textStamp = new com.aspose.pdf.TextStamp(new com.aspose.pdf.facades.FormattedText("Heading", java.awt.Color.BLACK, java.awt.Color.WHITE, com.aspose.pdf.facades.FontStyle.Helvetica, com.aspose.pdf.facades.EncodingType.Winansi, true, 14));

For long we believed that closing streams was the issue, but as you suggest, it is something different altogether.

Please specify a workaround or a fix for this FD issue, whether in Aspose Java or in the fonts, for that matter. For the last 2 to 3 months we have been regularly restarting the JVM to keep the FD count in check, but this too is getting tiresome for the teams as load increases on the PDF-generation service using Aspose.

Waiting for a prompt response for resolution of this long running thread.

Regards,
Waleed Khalid

@Waleed_Khalid

We investigated this issue in detail and noticed that it is not related to Aspose.PDF. Could you please share the following details, along with the complete steps you are following, so that we can reproduce the same issue on our end?

  • Java version
  • Red Hat Linux version
  • Screenshots with the names of the opened FDs
  • Any other details that would help us reproduce the issue in our environment

Thanks for your cooperation.

Hi,

We have observed similar behavior in our application. Our differences are:

  1. We use Aspose in an AWS Lambda (serverless) architecture, so there is no server we control; the JVM is launched and terminated by AWS.
  2. We are using Aspose for HTML-to-image (JPG) conversion.
  3. It happens under heavy load (rapidly arriving requests).

Our application responds to events on a messaging topic and creates a JPEG.

For performance, AWS Lambda will re-use a Lambda function instance (a warm start) as opposed to starting afresh (a cold start).

So when a lot of orders arrive in rapid sequence, a single instance is re-used to serve multiple requests. In each request we do the same thing: create an Aspose HtmlDocument object, then call Converter.convertHTML(), which presumably leaks some file handles that accumulate and starve the OS of resources.

So a time comes when AWS Lambda terminates the warm instance and starts a fresh one (cold start), and operation proceeds smoothly for some more time, until the above sequence repeats.

We are currently struggling to find a workaround.

thanks

@jnsunkersett

Would you please confirm whether you are using the latest version of the API? Also, please share some more steps so that we can replicate the issue in our environment and address it accordingly.

Hi,
I upgraded to

    groupId: com.aspose – artifactId: aspose-html – version: 22.9-jdk1.8

and could still reproduce the issue.

However, I feel it would be difficult for you to reproduce the issue at your end, as I believe it is very much due to the resources available in the AWS Lambda environment.

Code-wise, my Lambda function receives a Kafka trigger giving the location of an S3 bucket which has some artifacts. The artifacts (XML etc.) are then parsed, an HTML document is built as a String, and then we use:

public String createInfotechImage(PictureData picData) throws IOException {

    LOG.info("The template has been read and tokens replaced.");

    // "html" is the HTML template (a field on this class), with tokens already replaced
    File htmlFile = File.createTempFile("infotech", ".html");
    OutputStream os = Files.newOutputStream(htmlFile.toPath());
    os.write(html.getBytes());
    os.close();

    String pathToHtml = htmlFile.getAbsolutePath();
    Url urlToPage = new Url(pathToHtml);
    String imageFile = File.createTempFile("myPictur", ".jpg").getAbsolutePath();

    // Initialize an HTML document from the HTML file
    com.aspose.html.HTMLDocument document = new com.aspose.html.HTMLDocument(urlToPage);

    try {
        com.aspose.html.saving.ImageSaveOptions options =
                new com.aspose.html.saving.ImageSaveOptions(com.aspose.html.rendering.image.ImageFormat.Jpeg);

        com.aspose.html.rendering.PageSetup pageSetup = new com.aspose.html.rendering.PageSetup();
        com.aspose.html.drawing.Page anyPage = new com.aspose.html.drawing.Page();

        options.setHorizontalResolution(Resolution.fromDotsPerInch(300));
        options.setVerticalResolution(Resolution.fromDotsPerInch(300));
        options.setBackgroundColor(com.aspose.html.drawing.Color.fromName(picData.getBkColor()));

        anyPage.setSize(new com.aspose.html.drawing.Size(
                com.aspose.html.drawing.Length.fromPixels(picData.getWidth()),
                com.aspose.html.drawing.Length.fromPixels(picData.getHeight())));

        pageSetup.setAnyPage(anyPage);
        options.setPageSetup(pageSetup);

        // Convert HTML to JPEG
        com.aspose.html.converters.Converter.convertHTML(document, options, imageFile);

    } finally {
        if (document != null) {
            document.dispose();
        }
    }

    LOG.debug("Image File {}", imageFile);
    File nf = new File(imageFile);
    String im = nf.getPath();
    LOG.info("The Infotech Image File has been created. {}", im);

    htmlFile.delete();

    return im; // return the path of the JPEG file, created on EFS
}
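One thing worth checking alongside the descriptor count: in a warm Lambda container, files created with File.createTempFile are never cleaned up between invocations, so temp artifacts can accumulate in /tmp as well. A hypothetical cleanup helper (the class and method names are illustrative, not from the codebase above) might look like:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class TempCleanup {
    // Creates a temp file and registers a JVM-exit backstop; warm containers
    // should still delete eagerly after each request, since exit may be rare.
    public static File newTracked(String prefix, String suffix) throws IOException {
        File f = File.createTempFile(prefix, suffix);
        f.deleteOnExit();
        return f;
    }

    // Deletes a temp artifact without throwing, suitable for finally blocks.
    public static void deleteQuietly(File f) {
        if (f != null && !f.delete()) {
            try {
                Files.deleteIfExists(f.toPath());
            } catch (IOException ignored) {
                // swallow: cleanup failure should not mask the real result
            }
        }
    }
}
```

Deleting the generated JPEG once it has been consumed (rather than only the HTML input) would keep a warm container from accumulating artifacts between requests.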

This code works perfectly when a small number of payloads arrive on the Kafka topic, but as the frequency of messages on Kafka increases to something like hundreds or thousands, multiple instances of my Lambda function are invoked, and at some point we start getting a "Too many open files" error in the log stream.

So, to work around the issue, we manually update a configuration parameter of the Lambda, which forces a cold start of the function, and it runs OK for a while, until the payload frequency increases again.

So I think we still have some kind of file descriptor/handle leak issue.
It would probably not happen (or would go unnoticed) if I moved to a Spring Boot app deployed on an EC2 instance.

thanks