Crash importing HTML file

I’ve found a reproducible crash importing some HTML documents. Simply constructing a Word document object from an appropriate HTML file triggers the problem:

Aspose.Word.Document wDoc = new Aspose.Word.Document(htmlFile);

I’m attaching a file that exhibits the problem. I think the problem may have to do either with certain constructs in a “style” attribute, such as a Color(0,0,0) specifier; or nested s. I haven’t tried to narrow it down precisely. Here are the exception details:

System.InvalidCastException: Specified cast is not valid.
at ó.?.get_ù()
at ó.?.a()
at ?.?.×(String ?, Font ù)
at ×.?.ü(’ c, Boolean é)
at ×.?.ê(’ c, Boolean é)
at ×.?.è(’ c, Boolean é)
at ×.?.ProcessNode(’ node)
at ×.?.ProcessNode(’ node)
at ×.?.ProcessNode(’ node)
at ×.?.ProcessNode(’ node)
at ×.?.ProcessNode(’ node)
at ×.?.?( ?)
at Aspose.Word.Document.?(Stream Q, String ?)
at Aspose.Word.Document..ctor(String fileName)


Thank you for considering Aspose.

Yes, style="COLOR: rgb(255,204,204)" causes the problem. At the moment, colors are supported in hexadecimal notation only.

Thanks for the info. For now, I’ll work around it by scanning the document for style attributes of this form and converting them to hexadecimal notation before importing.
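In case it helps anyone else hitting this, here is a minimal sketch of that kind of preprocessing step. The class and method names (RgbColorPreprocessor, ConvertRgbToHex) are hypothetical and not part of Aspose.Word. It assumes integer rgb(r,g,b) components only, and for simplicity it rewrites every rgb(...) occurrence in the text rather than only those inside style attributes.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Word: rewrites rgb(r,g,b)
// color specifiers as #RRGGBB before the HTML is handed to the importer.
public static class RgbColorPreprocessor
{
    // Matches rgb(r, g, b) with integer components, ignoring case and
    // optional whitespace. Percentage components are not handled.
    private static readonly Regex RgbPattern = new Regex(
        @"rgb\(\s*([0-9]{1,3})\s*,\s*([0-9]{1,3})\s*,\s*([0-9]{1,3})\s*\)",
        RegexOptions.IgnoreCase);

    public static string ConvertRgbToHex(string html)
    {
        return RgbPattern.Replace(html, delegate(Match match)
        {
            int r = ParseComponent(match.Groups[1].Value);
            int g = ParseComponent(match.Groups[2].Value);
            int b = ParseComponent(match.Groups[3].Value);
            // Format each component as two uppercase hex digits.
            return string.Format(CultureInfo.InvariantCulture,
                "#{0:X2}{1:X2}{2:X2}", r, g, b);
        });
    }

    // Parses one component and clamps it to the valid 0-255 range.
    private static int ParseComponent(string text)
    {
        int value = int.Parse(text, CultureInfo.InvariantCulture);
        return Math.Min(255, value);
    }
}
```

The converted text could then be written to a temporary file, and that file's path passed to the same Document constructor shown in the original post.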

Are there plans to remove this limitation?