I am evaluating the Aspose components (Words, Pdf, Cells, Slides) for file conversions, and I have encountered a problem with HTML file conversion.
I use the Aspose.Words component to convert HTML files to JPEG images. With some HTML files, the conversion generates too many pages. For example, I have an HTML file that generates 30 pages, whereas Internet Explorer displays only 2 pages in its print preview.
If I convert the same HTML file to an .RTF file using Microsoft Office Word and then convert the .RTF file with Aspose.Words, I have exactly the same problem.
If I convert the HTML file to a .DOC file with Microsoft Office Word and then convert the .DOC file with Aspose.Words, I do not get the same error, but the result is still not good.
The HTML file is not public, so I can't attach it to this post. If you want these files, please give me an email address where I can send them.
Thank you for the additional information. The problem occurs partly because there are merged cells in your HTML document. This is issue #7739 in our defect database. As a workaround, you can try removing the content from merged cells, as shown in the following code:
Document doc = new Document(@"in.html", LoadFormat.Html, "");
// Remove content from merged cells.
// Get the collection of all cells in the document.
NodeCollection cells = doc.GetChildNodes(NodeType.Cell, true);
foreach (Cell cell in cells)
{
    // Check whether the cell is merged with the previous one.
    if (cell.CellFormat.HorizontalMerge == CellMerge.Previous ||
        cell.CellFormat.VerticalMerge == CellMerge.Previous)
    {
        // Remove content from the cell.
        cell.RemoveAllChildren();
    }
}
doc.SaveToPdf(@"out.pdf");
However, the output width of the table is still incorrect. This is issue #8579 in our defect database. I will notify you as soon as these issues are resolved.
Best regards.
Hi,
Thank you for your fast answer. I tried the code that you sent me. It does reduce the number of generated images (only 3 images now). I will let you know if I have other problems.
Best regards.