Error when converting HTML to Word: com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded

We are using aspose-words-16.10.0-jdk16 to convert an HTML file to a Word file.
For the following type of content we get:
com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.
The cause seems to be display:list-item. If we nest elements styled with display:list-item and try to convert to Word, the conversion fails.
This worked fine with aspose-words-15.6.0-jdk16.

.general
{
    display: block;
}
.ol
{
    display: list-item;
    list-style-type: decimal;
}

Hi Ravi,

Thanks for your inquiry.

Using the latest version of Aspose.Words, 16.11.0, we reproduced this issue on our end and logged it in our bug tracking system as WORDSNET-14529. Your request has been linked to that issue, and you will be notified as soon as it is resolved. Sorry for the inconvenience.

Best regards,

Hi Hafeez,

Thanks for the reply.
What is the expected time for the fix? We are stuck with our release because of this.
Is there any code fix or workaround we can apply on our side?

Thanks
Ravi Shekhar

Hi Ravi,

Thanks for your inquiry. We regret to share that the fix for this issue has been postponed for now. It may be added to the product roadmap in the future, but we cannot currently promise a resolution date. We apologize for the inconvenience.

The error occurs because Aspose.Words currently cannot handle nested list item elements. This situation cannot arise for normal list items (<li>), because our HTML parser prevents them from nesting. However, if elements that can be nested (for example, <div> or <span>) are turned into list items via the ‘display: list-item’ style, our HTML parser does nothing to prevent them from nesting, and the resulting HTML tree causes an error in our list reader.

For now, you need to avoid nesting elements with the ‘display: list-item’ style; otherwise Aspose.Words will refuse to load the document.

In your case, it is enough to remove the ‘class="ol"’ attribute from the innermost <div> as a workaround:

<html>
<head>
    <style>
        .general {
            display: block;
        }

        .ol {
            display: list-item;
            list-style-type: decimal;
        }
    </style>
</head>
<body>
    <span class="general">
        <span class="ol">
            <div />
        </span>
    </span>
</body>
</html>
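
To find other documents with the same pattern before sending them to conversion, a pre-check along the following lines may help. This is only a minimal sketch and not part of Aspose.Words: the class name NestedListItemCheck and its method are hypothetical, it uses only the JDK's built-in XML parser, it assumes the input is well-formed XHTML (like the sample above), and it assumes you already know which CSS class names carry ‘display: list-item’ (here, "ol"). It does not parse the stylesheet itself.

```java
import java.io.StringReader;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper (not an Aspose.Words API): reports whether any element
// carrying a "list item" class is nested inside another such element.
final class NestedListItemCheck {

    // xhtml must be well-formed XML; listItemClasses holds the class names
    // that are styled with display: list-item (an assumption supplied by the caller).
    static boolean hasNestedListItems(String xhtml, Set<String> listItemClasses) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)))
                    .getDocumentElement();
            return visit(root, listItemClasses, false);
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    // Depth-first walk; insideListItem is true when some ancestor is a list item.
    private static boolean visit(Element element, Set<String> classes, boolean insideListItem) {
        boolean isListItem = false;
        // The class attribute may hold several space-separated names.
        for (String name : element.getAttribute("class").trim().split("\\s+")) {
            if (classes.contains(name)) {
                isListItem = true;
            }
        }
        if (isListItem && insideListItem) {
            return true; // a list item nested in another list item
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element
                    && visit((Element) child, classes, insideListItem || isListItem)) {
                return true;
            }
        }
        return false;
    }
}
```

Calling NestedListItemCheck.hasNestedListItems(html, java.util.Collections.singleton("ol")) on the workaround document above should return false, while the same markup with class="ol" also set on the inner <div> should return true. Real-world HTML that is not well-formed XML would need a proper HTML parser instead.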

Best regards,

The issues you found earlier (filed as WORDSNET-14529) have been fixed in the Aspose.Words for .NET 17.4 and Aspose.Words for Java 17.4 updates.
