Keep CSS Class Name and ID of an HTML element with Aspose Words

Hello,

I am processing a file with Aspose.Words. The input is an HTML document (not valid XHTML) and the output is HTML as well. I would like to keep the CSS class name and/or ID of an element (see the HTML table below).

Could you please let me know whether that is possible? I wasn’t able to do it.

The input file is the following:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Some Title</title>
</head>
<body>
    <table id="123" class="ABC">
        <tr>
            <th>A</th>
            <th>B</th>
        </tr>
        <tbody>
            <tr>
                <td>1</td>
            </tr>
        </tbody>
    </table>
</body>
</html>

The output looks like this:

<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta http-equiv="Content-Style-Type" content="text/css" />
    <meta name="generator" content="Aspose.Words for Java 21.1.0" />
    <title>Some Title</title>
</head>
<body style="font-family:'Times New Roman'; font-size:12pt">
    <div>
        <p style="margin-top:0pt; margin-bottom:0pt">
            <span> </span>
        </p>
        <table cellspacing="2" cellpadding="0" style="border-spacing:1.5pt">
            <tr>
                <td style="padding:0.75pt; vertical-align:middle">
                    <p style="margin-top:0pt; margin-bottom:0pt; text-align:center; font-size:12pt">
                        <span style="font-weight:bold">A</span>
                    </p>
                </td>
                <td style="padding:0.75pt; vertical-align:middle">
                    <p style="margin-top:0pt; margin-bottom:0pt; text-align:center; font-size:12pt">
                        <span style="font-weight:bold">B</span>
                    </p>
                </td>
            </tr>
            <tr>
                <td style="padding:0.75pt; vertical-align:middle">
                    <p style="margin-top:0pt; margin-bottom:0pt; font-size:12pt">
                        <span>1</span>
                    </p>
                </td>
                <td style="padding:0.75pt; vertical-align:middle">
                    <p style="margin-top:0pt; margin-bottom:0pt; font-size:12pt">
                        <span> </span>
                    </p>
                </td>
            </tr>
        </table>
        <p style="margin-top:0pt; margin-bottom:0pt">
            <span> </span>
        </p>
    </div>
</body>
</html>

I am using the Java API with the following configuration:

HtmlSaveOptions options = new HtmlSaveOptions();
options.setEncoding(UTF_8);      // assumes a static import of java.nio.charset.StandardCharsets.UTF_8
options.setHtmlVersion(XHTML);   // assumes a static import of com.aspose.words.HtmlVersion.XHTML
options.setExportXhtmlTransitional(true);
options.setExportImagesAsBase64(true);
options.setOfficeMathOutputMode(HtmlOfficeMathOutputMode.MATH_ML);
options.setExportRoundtripInformation(false);

@rkirkov Unfortunately, there is no way to keep the original HTML document structure, including element IDs and CSS class names, after processing the document with Aspose.Words.
Please note that Aspose.Words is designed to work with MS Word documents. When an HTML document is loaded, it is converted to the Aspose.Words DOM, and because the HTML and MS Word object models differ, it is not always possible to preserve the HTML with 100% fidelity.
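
To illustrate why the attributes disappear, here is a minimal, library-free sketch of a lossy round trip. It is not Aspose.Words code: the class LossyRoundTrip, the TableModel type, and all method names are hypothetical and exist only for this example. The assumption it demonstrates is that the intermediate model stores formatting only, so anything without a slot in that model (here, id and class) cannot be written back on export.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT how Aspose.Words is implemented.
class LossyRoundTrip {

    /** Stand-in for a word-processing table: it holds formatting, with no id/class slot. */
    static final class TableModel {
        final int cellSpacing;

        TableModel(int cellSpacing) {
            this.cellSpacing = cellSpacing;
        }
    }

    /** Collects the attributes of the first <table> tag (naive regex, for the demo only). */
    static Map<String, String> tableAttributes(String html) {
        Map<String, String> attrs = new LinkedHashMap<>();
        Matcher tag = Pattern.compile("<table\\b([^>]*)>").matcher(html);
        if (tag.find()) {
            Matcher attr = Pattern.compile("([\\w-]+)=\"([^\"]*)\"").matcher(tag.group(1));
            while (attr.find()) {
                attrs.put(attr.group(1), attr.group(2));
            }
        }
        return attrs;
    }

    /** "Import": only values the model can represent are kept; id and class have nowhere to go. */
    static TableModel importTable(String html) {
        Map<String, String> attrs = tableAttributes(html);
        int spacing = Integer.parseInt(attrs.getOrDefault("cellspacing", "2"));
        return new TableModel(spacing);
    }

    /** "Export": the markup is regenerated from the model alone, not from the original HTML. */
    static String exportTable(TableModel model) {
        return "<table cellspacing=\"" + model.cellSpacing + "\">";
    }

    public static void main(String[] args) {
        String input = "<table id=\"123\" class=\"ABC\">";
        String output = exportTable(importTable(input));
        System.out.println(tableAttributes(input).keySet());   // prints [id, class]
        System.out.println(tableAttributes(output).keySet());  // prints [cellspacing]
    }
}
```

The export step never sees the original markup, only the model, which is why no save option can bring the dropped attributes back.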
