Hi,
Our use case involves decoding a Base64-encoded HTML string and then converting the HTML content into a PDF. However, we noticed that the conversion consistently takes longer than 5 minutes.
We also noticed that the resulting PDF does not have the correct styles. We went through some examples provided by Aspose and noticed that the inline CSS used by Aspose seems to be formatted a bit differently (single quotes vs. double quotes, etc.). Is there any guide on how the HTML should be formatted?
I am attaching a sample code snippet below.
We use a different HTML file in production, but I am facing the same issues with the attached HTML file as well.
sample-html.zip (37.0 KB)
String decodedHtml;
try {
    // Decode with an explicit charset so the result does not depend on the platform default.
    decodedHtml = new String(Base64.getDecoder().decode(base64EncodedHtml), StandardCharsets.UTF_8);
} catch (Exception e) {
    // Keep the original exception as the cause so the failure can be diagnosed.
    throw new RuntimeException("Could not decode the visualizations html", e);
}
HtmlLoadOptions htmlLoadOptions = new HtmlLoadOptions();
htmlLoadOptions.getPageInfo().setHeight(PageSize.getA4().getHeight());
htmlLoadOptions.getPageInfo().setWidth(PageSize.getA4().getWidth());
ByteArrayInputStream inputStream = new ByteArrayInputStream(decodedHtml.getBytes(StandardCharsets.UTF_8));
Document pdfDocument = new Document(inputStream, htmlLoadOptions);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
pdfDocument.save(outputStream);
HttpHeaders headers = new HttpHeaders();
headers.add("Cache-Control", "no-cache, no-store, must-revalidate");
headers.add("Pragma", "no-cache");
headers.add("Expires", "0");
headers.setContentType(MediaType.APPLICATION_PDF);
headers.add("Content-Disposition", "attachment; filename=output.pdf");
return new ResponseEntity<>(outputStream.toByteArray(), headers, HttpStatus.OK);
@shashwatrathod
It seems that you are experiencing performance issues and style discrepancies when converting HTML to PDF using Aspose. Here are some suggestions to help you address these concerns:
Performance Optimization
- Resource Loading: If your HTML references external resources (like images or stylesheets), consider using a custom resource loader to manage how these resources are loaded. This can help reduce the time taken for conversion. You can implement a custom loader that either loads resources from a local path or returns empty data for resources that are not critical.
Here’s a basic example of how to set up a custom resource loader:
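HtmlLoadOptions options = new HtmlLoadOptions();
options.setCustomLoaderOfExternalResources(new LoadOptions.ResourceLoadingStrategy() {
    public LoadOptions.ResourceLoadingResult invoke(String resourceURI) {
        // Load resources as needed, or return empty data if the resource is not critical.
        // Note: an empty result for a stylesheet or image means that resource
        // contributes nothing to the PDF, so only skip what you can do without.
        return new LoadOptions.ResourceLoadingResult(new byte[0]);
    }
});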
- Page Size Configuration: Ensure that you are setting the page size correctly. You can specify the dimensions of the PDF pages to match your HTML content, which may help in rendering performance.
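The "load from a local path or return empty data" decision described above can be sketched in plain Java, independent of Aspose. This is a minimal sketch under stated assumptions: `LocalResourceResolver` and its URI-to-file map are hypothetical names for illustration and are not part of the Aspose.PDF API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Aspose.PDF: decides which bytes to hand back
// for a resource URI that is requested while the HTML is being loaded.
class LocalResourceResolver {

    // Maps a resource URI as it appears in the HTML to a local copy of that resource.
    private final Map<String, Path> localCopies;

    LocalResourceResolver(Map<String, Path> localCopies) {
        this.localCopies = new HashMap<>(localCopies);
    }

    // Returns the local copy's bytes, or an empty array for unmapped or unreadable resources.
    byte[] resolve(String resourceUri) {
        Path localCopy = localCopies.get(resourceUri);
        if (localCopy == null || !Files.isReadable(localCopy)) {
            return new byte[0];
        }
        try {
            return Files.readAllBytes(localCopy);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed setup: a stylesheet that was downloaded ahead of time.
        Path css = Files.createTempFile("site", ".css");
        Files.write(css, "body { color: #333; }".getBytes(StandardCharsets.UTF_8));

        LocalResourceResolver resolver =
                new LocalResourceResolver(Map.of("https://example.com/site.css", css));

        // Mapped resource: served from disk. Unmapped resource: empty data.
        System.out.println(resolver.resolve("https://example.com/site.css").length);
        System.out.println(resolver.resolve("https://example.com/tracker.js").length);
    }
}
```

Inside the `invoke` callback shown in this reply, the result of `resolver.resolve(resourceURI)` would presumably be passed to `new LoadOptions.ResourceLoadingResult(...)`. Stylesheets and images that the PDF needs should be mapped to local copies, since anything left unmapped comes back empty.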
Style Formatting
Regarding the styles not being applied correctly, it’s important to ensure that your HTML and CSS are formatted in a way that Aspose can interpret correctly. Here are some tips:
- Inline CSS: Aspose tends to work better with inline CSS. If possible, try to include styles directly within the HTML elements rather than relying on external stylesheets.
- Quote Consistency: Ensure that you are consistent with the use of quotes in your HTML attributes. While this may seem minor, inconsistencies can sometimes lead to parsing issues.
- HTML Structure: Make sure your HTML is well-structured and valid. Use tools like HTML validators to check for any errors that might affect rendering.
Example Code
Here’s a refined version of your code snippet that incorporates a custom resource loader:
String decodedHtml;
try {
    // Decode with an explicit charset so the result does not depend on the platform default.
    decodedHtml = new String(Base64.getDecoder().decode(base64EncodedHtml), StandardCharsets.UTF_8);
} catch (Exception e) {
    // Keep the original exception as the cause so the failure can be diagnosed.
    throw new RuntimeException("Could not decode the visualizations html", e);
}
HtmlLoadOptions htmlLoadOptions = new HtmlLoadOptions();
htmlLoadOptions.getPageInfo().setHeight(PageSize.getA4().getHeight());
htmlLoadOptions.getPageInfo().setWidth(PageSize.getA4().getWidth());
htmlLoadOptions.setCustomLoaderOfExternalResources(new LoadOptions.ResourceLoadingStrategy() {
    public LoadOptions.ResourceLoadingResult invoke(String resourceURI) {
        // Custom logic for loading resources goes here.
        // As written, every external resource resolves to empty data, including
        // stylesheets and images, so replace this for resources the PDF needs.
        return new LoadOptions.ResourceLoadingResult(new byte[0]);
    }
});
ByteArrayInputStream inputStream = new ByteArrayInputStream(decodedHtml.getBytes(StandardCharsets.UTF_8));
Document pdfDocument = new Document(inputStream, htmlLoadOptions);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
pdfDocument.save(outputStream);
HttpHeaders headers = new HttpHeaders();
headers.add("Cache-Control", "no-cache, no-store, must-revalidate");
headers.add("Pragma", "no-cache");
headers.add("Expires", "0");
headers.setContentType(MediaType.APPLICATION_PDF);
headers.add("Content-Disposition", "attachment; filename=output.pdf");
return new ResponseEntity<>(outputStream.toByteArray(), headers, HttpStatus.OK);
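The Base64 decoding step can also be isolated and checked on its own, before Aspose is involved. The following is a minimal sketch, assuming the HTML was UTF-8 text before it was encoded; the class and method names (`HtmlDecoding`, `decodeBase64Html`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: turns a Base64 string back into HTML text.
class HtmlDecoding {

    // Assumes the HTML was UTF-8 before encoding; adjust the charset if it was not.
    static String decodeBase64Html(String base64EncodedHtml) {
        try {
            byte[] raw = Base64.getDecoder().decode(base64EncodedHtml);
            return new String(raw, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            // Thrown by the decoder when the input is not valid Base64; keep it as the cause.
            throw new IllegalArgumentException("Could not decode the visualizations html", e);
        }
    }

    public static void main(String[] args) {
        String original = "<p style=\"color: red\">caf\u00e9</p>";
        String encoded = Base64.getEncoder()
                .encodeToString(original.getBytes(StandardCharsets.UTF_8));

        // Round trip: the decoded text should match the original, including non-ASCII characters.
        System.out.println(decodeBase64Html(encoded).equals(original));
    }
}
```

Decoding and re-encoding with the same explicit charset keeps non-ASCII characters intact regardless of the JVM's default encoding.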
By implementing these suggestions, you should see improvements in both the performance of the conversion and the accuracy of the styles in the resultant PDF. If you continue to experience issues, consider sharing specific examples of the HTML content that is causing problems for further assistance.
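Before sharing a sample, it may help to list which external URLs the HTML references, since those are the resources a custom loader would be asked for. Below is a rough sketch using a regular expression rather than a real HTML parser; `ExternalResourceScanner` is a hypothetical name and nothing here is Aspose API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper: lists absolute http(s) URLs found in
// src or href attributes of an HTML string.
class ExternalResourceScanner {

    // Matches src="..." or href='...' whose value starts with http:// or https://.
    // This also matches ordinary <a href> links, which are not fetched during conversion.
    private static final Pattern EXTERNAL_REF = Pattern.compile(
            "(?:src|href)\\s*=\\s*([\"'])(https?://[^\"']+)\\1",
            Pattern.CASE_INSENSITIVE);

    static List<String> findExternalUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = EXTERNAL_REF.matcher(html);
        while (matcher.find()) {
            urls.add(matcher.group(2));
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<link rel=\"stylesheet\" href=\"https://example.com/site.css\">"
                + "<img src='http://example.com/logo.png'>"
                + "<img src=\"local.png\">";

        // Prints only the two absolute URLs; the relative local.png is skipped.
        for (String url : findExternalUrls(html)) {
            System.out.println(url);
        }
    }
}
```

A regular expression is enough for a quick inventory, but it will miss URLs inside CSS (`url(...)`, `@import`) and unquoted attribute values.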