Word document converted from HTML with Aspose.Words is too large

Hi,
I am using an older version of Aspose.Words to generate a Word document from HTML text. The HTML contains some low-resolution images. Aspose.Words generates a 9 MB document. When I copy all the contents and save them to a new Word document, the resulting file is only 136 KB.
Is there a solution to this problem? Would a newer version of Aspose.Words help?
Please suggest.
Thanks in advance.

Hi

Thanks for your inquiry. Could you please attach your HTML document here for testing? I will check the issue and provide you with more information.
Best regards,

Hi,
Thanks for your reply. Please find the attached file HTMLText.zip for more details.
Thanks again.

Hi

Thank you for the additional information. I tried converting your HTML to DOC using the latest version of Aspose.Words, and the output document is ~1.5 MB. After opening and re-saving it in MS Word, the file size decreases significantly.
It seems the problem occurs because your document contains many merged cells, and the content of merged cells is duplicated. You can use the following code to work around the problem:

Document doc = new Document(@"Test001\HTMLText.htm");
RemoveContentFromMergedCells(doc);
doc.Save(@"Test001\out.doc");

/// <summary>
/// Removes content from merged cells.
/// </summary>
public void RemoveContentFromMergedCells(Document doc)
{
    // Get the collection of all cells in the document.
    NodeCollection cells = doc.GetChildNodes(NodeType.Cell, true);
    foreach (Cell cell in cells)
    {
        // Check whether the cell is merged with the previous one
        // (horizontally or vertically).
        if (cell.CellFormat.HorizontalMerge == CellMerge.Previous ||
            cell.CellFormat.VerticalMerge == CellMerge.Previous)
        {
            // Remove the duplicated content from the cell.
            cell.RemoveAllChildren();
        }
    }
}
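
To check whether duplicated content in merged cells really is what inflates the file, a small diagnostic along the following lines may help before removing anything. This is only a sketch and not part of the fix itself: the class name MergedCellReport and the method CountShapesInMergedCells are hypothetical, and it assumes an Aspose.Words version in which Cell lives in the Aspose.Words.Tables namespace and images are imported as Shape nodes.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Tables;

public static class MergedCellReport
{
    // Hypothetical helper: counts shapes (e.g. images) that sit inside
    // cells marked as merged with the previous cell.
    public static int CountShapesInMergedCells(Document doc)
    {
        int shapeCount = 0;
        foreach (Cell cell in doc.GetChildNodes(NodeType.Cell, true))
        {
            bool mergedWithPrevious =
                cell.CellFormat.HorizontalMerge == CellMerge.Previous ||
                cell.CellFormat.VerticalMerge == CellMerge.Previous;

            if (mergedWithPrevious)
            {
                // Count all shapes anywhere below this cell.
                shapeCount += cell.GetChildNodes(NodeType.Shape, true).Count;
            }
        }
        return shapeCount;
    }

    public static void Main()
    {
        Document doc = new Document(@"Test001\HTMLText.htm");
        Console.WriteLine("Shapes inside merged cells: " + CountShapesInMergedCells(doc));
    }
}
```

A large count here would be consistent with the explanation above; a count of zero would suggest the extra size comes from somewhere else.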

Hope this helps. Please let me know if you need more assistance; I will be glad to help.
Best regards.

Hi,
I tried the code you gave and it worked! The generated document is now 189 KB.
It is an awesome solution.
Thanks a lot for this.

The issues you have found earlier (filed as WORDSNET-1739) have been fixed in this .NET update and this Java update.
