HtmlFragment limitation

Is there a limitation on HtmlFragment content size?
Rendering a PDF with a content length of 5659 runs very fast, but when I add just one more character, Aspose hangs for some time and then fails with an out-of-memory exception.

Is there any way to avoid that?

Hi,


Thank you for contacting support. We did not find any performance degradation with content length 5660 compared to 5659 when using the latest version 17.4.0 of the Aspose.Pdf for .NET API. Kindly let us know which Aspose.Pdf API version and development platform you are using.

If you are using an old version of the API, then please upgrade it to the latest version 17.4.0, otherwise please share details of your test case, including code example and the input file (if any). It will help us to replicate the problem in our working environment and we will share our findings with you.

Thank you very much for the prompt reply.

I am experiencing this on the 17.4 Java version.
The test case is really simple:
final String PREFIX = "aspose/testfiles/index.html";
URL url = Resources.getResource(PREFIX);
String text = Resources.toString(url, Charsets.UTF_8);
// Instantiate Document object
Document doc = new Document();
// Add a page to pages collection of PDF file
Page page = doc.getPages().add();
// Instantiate HtmlFragment with HTML contents
HtmlFragment titel = new HtmlFragment(text);
// set MarginInfo for margin details
MarginInfo Margin = new MarginInfo();
Margin.setBottom(10);
Margin.setTop(200);
// Set margin information
titel.setMargin(Margin);
// Add HTML Fragment to paragraphs collection of page
page.getParagraphs().add(titel);
// Save PDF file
doc.save("output.pdf");

Hi,


Thank you for the details. Which library are you using for the Resources class? Kindly point us to its download repository or share the JAR file. We will then investigate further and share our findings with you. We await your response.

Please try to render
 new HtmlFragment(text)
with any HTML text of length > 5660.
It will take a long time and eventually fail with
java.lang.OutOfMemoryError: Java heap space

If the text length is less than 5659 chars,
it works just fine.

Hi,


Thank you for the details. We have tested your code by passing an HTML string to the HtmlFragment class constructor. We have executed both test cases three times with string lengths 5665 and 5658 and the results are as follows:

Html string length: 5665
14895 milliseconds
15272 milliseconds
16641 milliseconds

Html string length: 5658
16946 milliseconds
16597 milliseconds
15060 milliseconds

We did not notice any out-of-memory error or any significant difference in the results. The Java heap settings on our system are -Xms256m and -Xmx1024m.
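
For anyone who wants to reproduce timings like the ones above, here is a minimal sketch of a timing harness. The class name RenderTimer and the method timeMillis are hypothetical, and the workload is only a placeholder: the Document/HtmlFragment code from the test case would go in its place. This is not the harness used to produce the numbers in this thread.

```java
import java.util.concurrent.TimeUnit;

public class RenderTimer {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    public static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): replace with the code that builds
        // the Document, adds the HtmlFragment and calls doc.save(...).
        Runnable render = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };

        // Three runs per input, matching the way the results are reported here.
        for (int run = 1; run <= 3; run++) {
            System.out.println("Run " + run + ": " + timeMillis(render) + " milliseconds");
        }
    }
}
```

Running each input several times, as above, helps separate a real slowdown from JVM warm-up noise.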

Please retry that with:

-Xms128m
-Xmx750m
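
When comparing results across machines, it can help to confirm that the JVM actually picked up these options. The sketch below uses only standard JDK calls (Runtime.maxMemory and the RuntimeMXBean input arguments); the class name HeapInfo is hypothetical. Note that the reported maximum may differ slightly from the -Xmx value depending on the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class HeapInfo {

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    /** Options passed to the JVM at startup, for example -Xms128m and -Xmx750m. */
    public static List<String> jvmArgs() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMb());
        System.out.println("JVM arguments: " + jvmArgs());
    }
}
```

For example, launching with java -Xms128m -Xmx750m HeapInfo should list both options in the JVM arguments line.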

Hi,


Thank you for the inquiry. Please find the results with -Xms128m and -Xmx750m below:

Html string length: 5665
That took 14602 milliseconds
That took 14774 milliseconds
That took 14949 milliseconds

Html string length: 5658
That took 14788 milliseconds
That took 14564 milliseconds
That took 18377 milliseconds

Are you sure you are using HTML code in that text?


You can use this snippet to generate the HTML:

import org.apache.commons.lang3.StringUtils;

String longHtmlText = StringUtils.repeat("<p>TEST</p>", 520);
HtmlFragment titel = new HtmlFragment(longHtmlText);


I checked again and I’m still getting: java.lang.OutOfMemoryError: Java heap space
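
In case commons-lang3 is not on the classpath, a plain-JDK sketch like the one below builds an equivalent string and prints its length. It assumes the repeated unit is the paragraph element <p>TEST</p>; the class name HtmlSample and its repeat helper are hypothetical. The resulting text can then be passed to new HtmlFragment(...) as in the test case.

```java
public class HtmlSample {

    /** Concatenates the given unit count times, like StringUtils.repeat. */
    public static String repeat(String unit, int count) {
        StringBuilder sb = new StringBuilder(unit.length() * count);
        for (int i = 0; i < count; i++) {
            sb.append(unit);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: the repeated unit is a single paragraph element.
        String longHtmlText = repeat("<p>TEST</p>", 520);

        // 11 characters per unit * 520 repetitions = 5720 characters,
        // which is above the ~5660 threshold reported in this thread.
        System.out.println("Length: " + longHtmlText.length());
    }
}
```

Lowering the repetition count to 514 gives 5654 characters, which falls just below the reported threshold and can serve as the comparison case.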

Hi,


Thank you for the details. Our HTML string was different from yours, as we did not use any paragraph tags. Using your lines of code, we managed to replicate the out-of-memory error. It has been logged under the ticket ID PDFJAVA-36752 in our bug tracking system. When we increased the Java heap size, the process kept running and did not generate an output PDF. We have linked your post to this ticket and will keep you informed of any available updates. We are sorry for the inconvenience caused.

What is the status of that bug?

@Damian_Szuta,
The ticket ID PDFJAVA-36752 is pending analysis and has not been resolved yet. Our product team will investigate it as per the development schedule. We will let you know once significant progress has been made in this regard.

Best Regards,
Imran Rafique

Hey Damian,

It looks like you and I are having a very similar issue, except mine is in .NET. Since the Java code is derived from .NET, it’s likely that if either issue is fixed, both of our problems will go away. My ticket is PDFNET-44881.

The difference with my issue is that it occurs when multiple HTML fragments of substantial size are put into the same document, although none of my fragments get anywhere close to your string length of over 5000. Apart from this, the error message we receive and the circumstances of the error are practically the same. There is some kind of problem with the way memory is allocated for HtmlFragment and the way its buffer is processed.

@jsmith223,

Thank you for the details. The issue logged under the ticket ID PDFJAVA-36752 is similar to PDFNET-44881, but not the same. We will notify you in this thread once it is fixed.