InsertHtml loses style information

Hi,
I am currently using the evaluation version of Aspose.Words 5.0.2.0.
I am trying to insert an HTML file into a document.
The DocumentBuilder.InsertHtml method doesn't reproduce the HTML table exactly.
The fonts and colors of the table cells are not carried over to the .doc file.
In the attached zip file, you can find the HTML source file and the generated .doc file.
I suppose that the tag in the table is partially ignored.
An exact representation of the HTML table is very important to me.
Thanks for your help.

Hello Christophe!
Thank you for your interest in Aspose components.
There are several ways to specify formatting properties in HTML, and I'm sorry to say that we don't support all of them. Aspose.Words currently assumes the format described in the CSS2 specification, and even that with some restrictions.
To see how we represent HTML and what we expect from input files, you can do a round trip. First open your file in MS Word and save it as DOC. Then open that DOC with Aspose.Words and save it as HTML. You will get several nodes in place of each one in the source, plus miscellaneous padding and so on, but you will see that we work with something like this:
style="font-family:'Arial'; font-size:8pt; font-weight:bold; color:#000000; "
Here is the link to the CSS2 specification:
https://www.w3.org/TR/CSS2/
Although HTML support is not one of the main features of Aspose.Words, we plan to improve it in the future. This includes importing and exporting files with embedded or external CSS, and probably support for different dialects of formatting. Embedded CSS export is coming soon, maybe in two weeks. The other items are not planned for the near future, and I won't promise any dates. Thank you for understanding.
There are plenty of issues covering these improvements in our defect database:
#549 – Add support for tables styles during HTML import
#4247 – Import/export HTML align (and possibly CSS text-align) attribute on a table
#1145 – Separate styles from content when exporting to HTML
etc…
I can help you work around these restrictions if you need to improve the output layout. In particular, the data in the table could be taken from a database or any other source. We can then build the DOC document directly, without an intermediate HTML file. Finally, if you need HTML as well, we can save the DOC as HTML once it has been constructed.
Please let me know what else I can do for you, including help with this possible workaround.
Regards,

Hello Aspose Team,

I ran some new tests, without much success…

I have attached a zip file with two files inside:

  • a very small and simple HTML file: test.htm
  • a Word file generated with the InsertHtml(test.htm) method.

The HTML code:

<html>
<head>
</head>
<body>
    <table border="1" width="500px" bgcolor="#00FF00" style="font-family:Arial;font-size:16;color:#0000FF;">
        <tr>
            <td>AAA</td>
            <td>BBB</td>
        </tr>
        <tr>
            <td>CCC</td>
            <td style="font-family:Arial;font-size:25px;font-weight:bold;color:red;">DDD</td>
        </tr>
    </table>
</body>
</html>

I think I have put the right properties in the style attribute, but the styles in the table seem to be ignored.
What do I have to write in the HTML file to generate a correct Word file?

Best regards,

Hello!
Thank you for your feedback.
I experimented with your HTML and reworked it to fit the current Aspose.Words restrictions. Character formatting should be applied at span level. I know that it's tedious to repeat styles in several cells and to have redundant elements, but currently it works only this way. HTML import is one of the subsystems we are going to refactor and improve considerably, as I already wrote. Sorry for the inconvenience.
(The attachment is an HTML file with a changed extension, since our forum doesn't allow attaching HTML files directly.)
Have a nice day,
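
For readers following along, here is a minimal sketch of what span-level formatting could look like for the test table above. The attachment is not reproduced in this thread, so this is an illustration, not the file that was posted. It is written in Python only to avoid typing the repeated styles by hand. The helper names (span_cell, build_table) are made up for this example. The table attributes are copied from the question, and whether Aspose.Words honors them is not confirmed here. The original font-size:16 has no unit, which CSS2 does not allow for non-zero lengths, so 16pt is assumed.

```python
from html import escape

def span_cell(text, style):
    """Return a <td> whose text sits inside a <span> carrying the font style."""
    return '<td><span style="{}">{}</span></td>'.format(
        escape(style, quote=True), escape(text)
    )

def build_table(rows, default_style, overrides=None):
    """Build the table, repeating the font style on a span in every cell.

    `overrides` maps (row, column) to a style that replaces the default one.
    """
    overrides = overrides or {}
    # Table attributes copied from the question (assumption: left unchanged).
    lines = ['<table border="1" width="500px" bgcolor="#00FF00">']
    for r, row in enumerate(rows):
        lines.append("  <tr>")
        for c, text in enumerate(row):
            style = overrides.get((r, c), default_style)
            lines.append("    " + span_cell(text, style))
        lines.append("  </tr>")
    lines.append("</table>")
    return "\n".join(lines)

# Assumption: 16pt stands in for the unit-less font-size:16 of the question.
DEFAULT = "font-family:Arial; font-size:16pt; color:#0000FF;"
BOLD_RED = "font-family:Arial; font-size:25px; font-weight:bold; color:#FF0000;"

html_table = build_table(
    [["AAA", "BBB"], ["CCC", "DDD"]],
    DEFAULT,
    overrides={(1, 1): BOLD_RED},
)
print(html_table)
```

The printed markup can be placed in the body of test.htm and fed to InsertHtml the same way as the original file.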

Hello,

I am experiencing the same thing. Have there been any updates to the InsertHtml method since this post was made in February?

Thanks,
Chris

Hi Chris,
Thank you for your request. I'll ask the developer responsible for this feature, and he will get back to you in 2-3 days.

Hello!
The InsertHtml method and HTML import have not been improved in this respect. They still have some restrictions on how styles are applied. As I wrote, the rule is straightforward: specify paragraph formatting on p and h1…h6 elements, and font formatting on span elements. Let me know if you run into particular difficulties with this.
Regards,
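
As a rough illustration of that rule, the sketch below puts paragraph-level properties on the block element and font properties on spans inside it. It is a minimal Python example, not Aspose code: styled_block and PARAGRAPH_TAGS are hypothetical names, and the thread does not say which individual CSS properties are honored, so text-align and the font values here are assumptions.

```python
from html import escape

# Elements the thread names as carriers of paragraph-level formatting.
PARAGRAPH_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6"}

def styled_block(tag, paragraph_style, runs):
    """Build one block element; `runs` is a list of (text, font_style) pairs."""
    if tag not in PARAGRAPH_TAGS:
        raise ValueError(
            "paragraph formatting belongs on p or h1..h6, got " + repr(tag)
        )
    # Font formatting goes on a span around each run of text.
    spans = "".join(
        '<span style="{}">{}</span>'.format(
            escape(font_style, quote=True), escape(text)
        )
        for text, font_style in runs
    )
    # Paragraph formatting goes on the block element itself.
    return '<{0} style="{1}">{2}</{0}>'.format(
        tag, escape(paragraph_style, quote=True), spans
    )

fragment = styled_block(
    "p",
    "text-align:center;",  # assumed paragraph-level property
    [
        ("Total: ", "font-family:Arial; font-size:10pt;"),
        ("42", "font-family:Arial; font-size:10pt; font-weight:bold; color:#FF0000;"),
    ],
)
print(fragment)
```

Keeping the two kinds of style in separate arguments makes it harder to put a font property on the paragraph by accident, which is the mistake the thread started with.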

The issues you have found earlier (filed as WORDSNET-318) have been fixed in this .NET update and this Java update.
