Converting html to png / pdf: size of png is too big

We are converting an HTML document into a PNG, and we want the PNG to fit the size of the rendered HTML. However, the PNG image is much bigger than the HTML content (I guess it’s A4). We face the same problem using Aspose.Words. The SaveFormat used doesn’t make a difference (we also tried PDF). Please see the attached files.

aspose.words code snippet:
Document htmlDocument = new Document();
htmlDocument.ensureMinimum();
DocumentBuilder builder = new DocumentBuilder(htmlDocument);
builder.insertHtml(htmlStringtoRenderImage); //the unpacked “table.zip”
ByteArrayOutputStream bos = new ByteArrayOutputStream();
htmlDocument.save(bos, SaveFormat.PNG);
renderedImage = bos.toByteArray();

aspose.html code snippet:
HTMLDocument htmlDocument = new HTMLDocument(htmlFilePath);
ImageRenderingOptions options = new ImageRenderingOptions();
HtmlRenderer renderer = new HtmlRenderer();
ImageDevice device = new ImageDevice(options, pngFileName);
renderer.render(device, htmlDocument);
device.saveGraphicContext();

generated_with_aspose.html.png (9.0 KB)
generated_with_aspose.words.png (5.1 KB)
table.zip (1.2 KB)
generated_with_apsose.words.pdf.zip (45.3 KB)

@fbu

Thank you for contacting support.

I have worked with the data you shared. Please check the attached PNG and PDF files (Expected_Output.zip) and let us know whether this is the output you expect. If your requirements differ from the attached files, please share a sample of your expected output so that we can help you further.

Thanks for your reply. The PDF file and the PNG file are exactly what I’m expecting. How can I produce them?

@fbu

You can convert an HTML file to a PDF file with the Aspose.Words, Aspose.PDF, or Aspose.HTML API. Then, using Aspose.PDF, you can trim the white space on a PDF page with the code below:

// Load the source PDF document
com.aspose.pdf.Document document = new com.aspose.pdf.Document(dataDir + "Source.pdf");

// Get the page to trim (page numbering starts at 1)
com.aspose.pdf.Page pdfPage = document.getPages().get_Item(1);

// Get the content boundaries
com.aspose.pdf.Rectangle contentBBox = pdfPage.calculateContentBBox();

// Set the page CropBox and MediaBox to the content boundaries to trim white space
pdfPage.setCropBox(contentBBox);
pdfPage.setMediaBox(contentBBox);

// Save the resulting PDF
document.save(dataDir + "output_trim.pdf");

Then you can generate a PNG image of the resulting PDF file with the Aspose.PDF API, as explained in Convert PDF Pages to PNG Images.
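
If only a tightly cropped PNG is needed, the same trimming idea can also be applied to the rendered image itself. The following is a minimal sketch in plain Java (java.awt.image only, no Aspose API involved), assuming the page background is an opaque, uniform color such as white. The class name ContentCropper and the method cropToContent are hypothetical names chosen for illustration.

```java
import java.awt.image.BufferedImage;

public class ContentCropper {

    /**
     * Returns the smallest sub-image that contains every pixel whose RGB value
     * differs from backgroundRgb. Alpha is ignored (assumption: opaque background).
     * If no such pixel exists, the original image is returned unchanged.
     */
    public static BufferedImage cropToContent(BufferedImage image, int backgroundRgb) {
        int background = backgroundRgb & 0xFFFFFF;
        int minX = image.getWidth();
        int minY = image.getHeight();
        int maxX = -1;
        int maxY = -1;

        // Scan all pixels and track the bounding box of non-background pixels
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                if ((image.getRGB(x, y) & 0xFFFFFF) != background) {
                    minX = Math.min(minX, x);
                    minY = Math.min(minY, y);
                    maxX = Math.max(maxX, x);
                    maxY = Math.max(maxY, y);
                }
            }
        }

        // Nothing but background: there is no content to crop to
        if (maxX < 0) {
            return image;
        }

        // getSubimage shares pixel data with the source image
        return image.getSubimage(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }
}
```

A rendered PNG could be loaded with javax.imageio.ImageIO.read, passed to cropToContent(image, 0xFFFFFF), and written back with ImageIO.write. Unlike trimming the PDF page boxes, this works on pixels, so anti-aliased edges and near-white backgrounds may need a tolerance that this sketch does not implement.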

I hope this will be helpful. Please feel free to let us know if you need any further assistance.

Hello. I see that this approach can utilize an intermediate PDF generation step in order to produce an image containing just the HTML content. I’d like to follow up to see whether there is a more direct way - i.e. starting from an HTMLDocument, and not knowing in advance the sizing of the HTML content, can we call Converter.ConvertHTML with some set of options to produce a TIFF image which contains only the actual content and no extra whitespace?

@jamesflagg

We are checking this and will get back to you soon.

@jamesflagg

Thank you for creating a separate topic.

We have asked you to share the data there. Kindly follow up in the [respective thread](https://forum.aspose.com/t/generate-tiff-based-on-html-want-result-cropped-to-content/202474).