Issue with RTF to HTML conversion using the Aspose.Words library in Java

Hello Support Team,

We have some RTF text containing table data. The table header has some background color applied. The rows don’t have any background color applied.

When this RTF text is converted to HTML using Aspose.Words for Java, the header’s background color spills over into the first column of row 1.
I have attached screenshots demonstrating the issue.

I have also attached the Java code snippet, the input RTF text, and the output HTML for your reference.

I would appreciate your quick support on this!

Regards,
Ankur Kotak

Actual table

Background color spilled

Java code snippet


private String convertRTFtoHtmlText(String rtfText) throws Exception 
{
	ByteArrayInputStream bis = new ByteArrayInputStream(rtfText.getBytes("UTF-8"));
		
	Document document = new Document(bis);

	HtmlSaveOptions saveOptions = new HtmlSaveOptions(SaveFormat.HTML);
	saveOptions.setExportImagesAsBase64(true);
	saveOptions.setExportListLabels(ExportListLabels.BY_HTML_TAGS);

	// Save the document to HTML format
	ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
	document.save(outputStream, saveOptions);
	String htmlText = StringUtils.substringAfter(StringUtils.substringBetween(outputStream.toString("UTF-8"), "<body", "</body>"), ">");
		
	//handle double-encoding character entity (&amp;#145; → &#145;)
	htmlText = htmlText.replaceAll("&amp;#(\\d+);", "&#$1;");

	return htmlText;
}

Input RTF Data

{\rtf1\ansi\deff0\uc1\ansicpg1252\deftab720{\fonttbl{\f0\fnil\fcharset1 Solumina Ansi Y;}{\f1\fnil\fcharset1 Arial;}{\f2\fnil\fcharset0 Arial;}{\f3\fnil\fcharset2 WingDings;}{\f4\fnil\fcharset2 Symbol;}}{\colortbl\red0\green0\blue0;\red255\green0\blue0;\red0\green128\blue0;\red0\green0\blue255;\red255\green255\blue0;\red255\green0\blue255;\red128\green0\blue128;\red128\green0\blue0;\red0\green255\blue0;\red0\green255\blue255;\red0\green128\blue128;\red0\green0\blue128;\red255\green255\blue255;\red192\green192\blue192;\red128\green128\blue128;\red0\green0\blue0;\red0\green0\blue0;}\wpprheadfoot1\paperw12240\paperh15840\margl1880\margr1880\margt1440\margb1440\headery720\footery720\endnhere\sectdefaultcl{\*\generator WPTools_7.000;}{\*\listtable{\list\listtemplateid1\listsimple{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc23{\leveltext\'02\u376 x\'00;}{\levelnumbers\'02;}\f3}\listid1}}{\*\listoverridetable{\listoverride\listid1\listoverridecount0\ls1}}{\pard\tblstart1{\trowd\trleft0\trftsWidth3\trwWidth9800\clbrdrr\brdrs\brdrw10\clbrdrt\brdrs\brdrw10\clbrdrl\brdrs\brdrw10\clcfpat4\clcbpat4\clshdng10000\clftsWidth3\clwWidth674\clvertalt\cellx674\clbrdrr\brdrs\brdrw10\clbrdrt\brdrs\brdrw10\clbrdrl\brdrs\brdrw10\clcfpat4\clcbpat4\clshdng10000\clftsWidth3\clwWidth2738\clvertalt\cellx3412\clbrdrr\brdrs\brdrw10\clbrdrt\brdrs\brdrw10\clbrdrl\brdrs\brdrw10\clcfpat4\clcbpat4\clshdng10000\clftsWidth3\clwWidth5042\clvertalt\cellx8454\clbrdrr\brdrs\brdrw10\clbrdrt\brdrs\brdrw10\clbrdrl\brdrs\brdrw10\clcfpat4\clcbpat4\clshdng10000\clftsWidth3\clwWidth1346\clvertalt\cellx9800\pard\intbl\li0\fi0\ri0\sb0\sa0\ql\clcfpat4\clcbpat4\clshdng10000\plain\f1\fs28\cf16\cb4\cell\pard\intbl\li0\fi0\ri0\sb0\sa0\ql\clcfpat4\clcbpat4\clshdng10000\plain\f1\fs28\cf16\cb4\cell\pard\intbl\li0\fi0\ri0\sb0\sa0\ql\clcfpat4\clcbpat4\clshdng10000\plain\f1\fs28\cf16\cb4\cell\pard\intbl\li0\fi0\ri0\sb0\sa0\ql\clcfpat4\clcbpat4\clshdng10000\plain\f1\fs28\cf16\cb4\cell\row}{
\trowd\trleft0\trftsWidth3\trwWidth9800\clbrdrb\brdrs\brdrw10\clbrdrr\brdrs\brdrw10\clbrdrt\brdrs\brdrw10\clbrdrl\brdrs\brdrw10\clftsWidth3\clwWidth674\clvertalt\cellx674\clbrdrb\brdrs\brdrw10\clbrdrr\brdrs\brdrw10\clbrdrt\brdrs\brdrw10\clbrdrl\brdrs\brdrw10\clftsWidth3\clwWidth2738\clvertalt\cellx3412\clbrdrb\brdrs\brdrw10\clbrdrr\brdrs\brdrw10\clbrdrt\brdrs\brdrw10\clbrdrl\brdrs\brdrw10\clftsWidth3\clwWidth5042\clvertalt\cellx8454\clbrdrb\brdrs\brdrw10\clbrdrr\brdrs\brdrw10\clbrdrt\brdrs\brdrw10\clbrdrl\brdrs\brdrw10\clftsWidth3\clwWidth1346\clvertalt\cellx9800\pard\intbl\li0\fi0\ri0\sb0\sa0\ql\plain\f2\fs28\cf16 1\cell\pard\intbl\ls1{\listtext\f3\fs28 \u376 ?\tab}\li360\fi-360\ri0\sb0\sa0\ql\plain\f1\fs28\cf16 Header is Yellow\cell\pard\intbl\li0\fi0\ri0\sb0\sa0\ql\pard\intbl\plain\plain\f1\fs28\cf16 Lengthy column . It has highest width.\par
\plain\f1\fs28\cf16 Column 1 - lowest width\par
\plain\f1\fs28\cf16 Column 4 - second  lowest width\par
\plain\f1\fs28\cf16 Column 2 - third\par
\plain\f1\fs28\cf16 Column 3 - fourth
\cell\pard\intbl\li0\fi0\ri0\sb0\sa0\ql\plain\f1\fs28\cf16\cell\row}\tblend1\pard\plain\par
}}

HTML Output


<table cellspacing="0" cellpadding="0" style="width:490.75pt; border:0.75pt solid #000000; -aw-border:0.5pt single; border-collapse:collapse">
   <tr>
      <td style="width:32.95pt; border-right-style:solid; border-right-width:0.75pt; vertical-align:top; background-color:#ffff00; -aw-border-right:0.5pt single">
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial; -aw-import:ignore">&#xa0;</span></p>
      </td>
      <td style="width:136.15pt; border-right-style:solid; border-right-width:0.75pt; border-left-style:solid; border-left-width:0.75pt; vertical-align:top; background-color:#ffff00; -aw-border-left:0.5pt single; -aw-border-right:0.5pt single">
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial; -aw-import:ignore">&#xa0;</span></p>
      </td>
      <td style="width:251.35pt; border-right-style:solid; border-right-width:0.75pt; border-left-style:solid; border-left-width:0.75pt; vertical-align:top; background-color:#ffff00; -aw-border-left:0.5pt single; -aw-border-right:0.5pt single">
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial; -aw-import:ignore">&#xa0;</span></p>
      </td>
      <td style="width:66.55pt; border-left-style:solid; border-left-width:0.75pt; vertical-align:top; background-color:#ffff00; -aw-border-left:0.5pt single">
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial; -aw-import:ignore">&#xa0;</span></p>
      </td>
   </tr>
   <tr>
      <td style="width:32.95pt; border-top-style:solid; border-top-width:0.75pt; border-right-style:solid; border-right-width:0.75pt; vertical-align:top; background-color:#ffff00; -aw-border-right:0.5pt single; -aw-border-top:0.5pt single">
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial">1</span></p>
      </td>
      <td style="width:136.15pt; border-top-style:solid; border-top-width:0.75pt; border-right-style:solid; border-right-width:0.75pt; border-left-style:solid; border-left-width:0.75pt; vertical-align:top; -aw-border-left:0.5pt single; -aw-border-right:0.5pt single; -aw-border-top:0.5pt single">
         <ul type="disc" class="awlist1" style="margin:0pt; padding-left:0pt">
            <li style="margin-left:18pt; text-indent:-18pt; widows:0; orphans:0; font-family:WingDings; font-size:14pt; -aw-list-padding-sml:11.59pt"><span style="width:11.59pt; font:7pt 'Times New Roman'; display:inline-block; -aw-import:ignore">&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0;&#xa0; </span><span style="font-family:Arial">Header is Yellow</span></li>
         </ul>
      </td>
      <td style="width:251.35pt; border-top-style:solid; border-top-width:0.75pt; border-right-style:solid; border-right-width:0.75pt; border-left-style:solid; border-left-width:0.75pt; vertical-align:top; -aw-border-left:0.5pt single; -aw-border-right:0.5pt single; -aw-border-top:0.5pt single">
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial">Lengthy column . It has highest width.</span></p>
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial">Column 1 - lowest width</span></p>
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial">Column 4 - second</span><span style="font-family:Arial; -aw-import:spaces">&#xa0; </span><span style="font-family:Arial">lowest width</span></p>
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial">Column 2 - third</span></p>
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial">Column 3 - fourth</span></p>
      </td>
      <td style="width:66.55pt; border-top-style:solid; border-top-width:0.75pt; border-left-style:solid; border-left-width:0.75pt; vertical-align:top; -aw-border-left:0.5pt single; -aw-border-top:0.5pt single">
         <p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0; font-size:14pt"><span style="font-family:Arial; -aw-import:ignore">&#xa0;</span></p>
      </td>
   </tr>
</table>
<p style="margin-top:0pt; margin-bottom:0pt; widows:0; orphans:0"><span style="font-family:'Solumina Ansi Y'; -aw-import:ignore">&#xa0;</span></p>

@akotak

Hello Ankur,

Thank you for providing the full reproducible sample; it helps us pinpoint the problem quickly.

What is happening?

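In the RTF, the header row's cell definitions carry the shading controls \clcfpat4\clcbpat4\clshdng10000 (colour index 4 in the colour table is yellow, #ffff00), and the same controls are repeated inside the header cells' content after \pard\intbl. The second row defines no shading at all, and the RTF contains no row-level shading control such as \trshdng.
The exported HTML nevertheless has background-color:#ffff00 on the first cell of the second row, which is exactly the visual artifact you see in the screenshots. The yellow therefore does not come from the second row's own definition; it appears to be carried over from the header row during import. A plausible but unconfirmed cause is the shading controls repeated in the header cell content.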
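To check which rows of the source RTF actually declare cell shading, a small probe along these lines may help. It is only a sketch: `RtfRowShadingProbe` and `countPerRow` are hypothetical names, the code does not use Aspose.Words, and it assumes every table row starts with `\trowd` and that the text contains no escaped backslashes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; not part of Aspose.Words.
final class RtfRowShadingProbe {

    /**
     * Counts how often an RTF control word (given without the backslash,
     * e.g. "clcbpat" or "trshdng") occurs in each \trowd-delimited row.
     */
    static List<Integer> countPerRow(String rtf, String controlWord) {
        // The lookahead stops "clcbpat" from also matching longer words such as "clcbpatraw".
        Pattern word = Pattern.compile("\\\\" + Pattern.quote(controlWord) + "(?![a-zA-Z])");
        String[] parts = rtf.split("\\\\trowd(?![a-zA-Z])");

        List<Integer> counts = new ArrayList<>();
        // parts[0] is everything before the first row, so it is skipped.
        for (int i = 1; i < parts.length; i++) {
            Matcher m = word.matcher(parts[i]);
            int n = 0;
            while (m.find()) {
                n++;
            }
            counts.add(n);
        }
        return counts;
    }
}
```

Calling `countPerRow(rtfText, "clcbpat")` and `countPerRow(rtfText, "trshdng")` on the input shows, row by row, whether any cell-level or row-level shading is declared in the source.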

How to solve it

  1. Upgrade to the latest Aspose.Words for Java and re-run the conversion. The issue is reported as fixed in version 23.12 and later; please confirm this against the release notes for the version you install.
    The fix is described as stopping the header cell shading from being propagated to the following row.

  2. If you cannot upgrade immediately, you can apply a small post‑processing step that clears the unintended background colour on the first cell of every row except the header.

    // After loading the document but before saving
    for (Table table : (Iterable<Table>) document.getChildNodes(NodeType.TABLE, true)) {
        for (Row row : table.getRows()) {
            // Skip the first (header) row of each table
            if (row.isFirstRow()) continue;

            Cell firstCell = row.getFirstCell();
            if (firstCell != null) {
                // Remove the shading that was incorrectly applied on import.
                // Note: this also clears any shading the first cell legitimately has.
                firstCell.getCellFormat().getShading().clearFormatting();
            }
        }
    }
    

    Then save to HTML as you already do. This removes the spurious yellow background while keeping the intended header shading.

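  3. Alternative export option (unverified): the idea is to ask the HTML writer not to export the unwanted shading. The two setters below are not confirmed members of HtmlSaveOptions, so check the API reference for your version before using them; if they are not available, use option 2 instead.

    HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.HTML);
    options.setExportTableStyle(false);          // Assumed setter: verify that it exists in your version
    options.setExportCssClassNames(true);        // Assumed setter: verify that it exists in your version
    

    If these options are available and behave as described, only the header cells should retain their background colour.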
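If changing the document model is not convenient, a further fallback is to post-process the HTML string that your method already returns. The sketch below is not an Aspose.Words feature: `BodyRowShadingStripper` is a hypothetical helper that removes `background-color` from the cell tags of every row except the first row of each table. It assumes that only header rows should be shaded, that tables are not nested, and that regex-based HTML editing is acceptable for this well-formed generated output.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of Aspose.Words.
final class BodyRowShadingStripper {
    // One <tr>...</tr> block (non-greedy; assumes no nested tables).
    private static final Pattern ROW =
            Pattern.compile("<tr\\b.*?</tr>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
    // Opening tag of a table cell.
    private static final Pattern CELL_TAG =
            Pattern.compile("<t[dh]\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // A background-color declaration plus its trailing semicolon and whitespace.
    private static final Pattern BACKGROUND =
            Pattern.compile("background-color\\s*:\\s*[^;\"']+;?\\s*", Pattern.CASE_INSENSITIVE);

    /** Keeps the first row of each table as is and strips cell backgrounds from later rows. */
    static String strip(String html) {
        StringBuilder out = new StringBuilder();
        Matcher rows = ROW.matcher(html);
        int last = 0;
        boolean firstRowOfTable = true;
        while (rows.find()) {
            String between = html.substring(last, rows.start());
            // A new <table before this row means the row is that table's first row.
            if (between.toLowerCase(Locale.ROOT).contains("<table")) {
                firstRowOfTable = true;
            }
            out.append(between);
            out.append(firstRowOfTable ? rows.group() : stripCellBackgrounds(rows.group()));
            firstRowOfTable = false;
            last = rows.end();
        }
        return out.append(html.substring(last)).toString();
    }

    // Only the <td>/<th> tags are edited, so styles on spans inside the cells are left alone.
    private static String stripCellBackgrounds(String row) {
        Matcher cells = CELL_TAG.matcher(row);
        StringBuffer out = new StringBuffer();
        while (cells.find()) {
            String cleaned = BACKGROUND.matcher(cells.group()).replaceAll("");
            cells.appendReplacement(out, Matcher.quoteReplacement(cleaned));
        }
        cells.appendTail(out);
        return out.toString();
    }
}
```

It would be called on the extracted body text, for example `htmlText = BodyRowShadingStripper.strip(htmlText);` just before the `return`. Because it removes every cell background in non-header rows, it is not suitable for documents where body cells are shaded on purpose.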

Recommended code (with upgrade)

Once you are on a version that contains the fix, the original conversion logic should need no changes. The version below only tidies it up (try-with-resources and StandardCharsets):

private String convertRTFtoHtmlText(String rtfText) throws Exception {
    try (ByteArrayInputStream bis = new ByteArrayInputStream(rtfText.getBytes(StandardCharsets.UTF_8))) {

        Document document = new Document(bis);

        HtmlSaveOptions saveOptions = new HtmlSaveOptions(SaveFormat.HTML);
        saveOptions.setExportImagesAsBase64(true);
        saveOptions.setExportListLabels(ExportListLabels.BY_HTML_TAGS);

        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        document.save(outputStream, saveOptions);
        String html = outputStream.toString(StandardCharsets.UTF_8.name());

        // Extract the body content
        html = StringUtils.substringAfter(
                StringUtils.substringBetween(html, "<body", "</body>"), ">");
        // Decode any double‑encoded entities
        html = html.replaceAll("&amp;#(\\d+);", "&#$1;");

        return html;
    }
}
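The body extraction and entity decoding at the end of that method can also be sketched without `StringUtils`, which makes the two steps easy to test in isolation. `HtmlBodyExtractor` is a hypothetical name, and unlike the commons-lang calls (which return null), this version returns an empty string when no `<body>` element is found.

```java
// Hypothetical helper; standard library only.
final class HtmlBodyExtractor {

    /** Returns the content between <body ...> and </body>, with &amp;#NNN; decoded to &#NNN;. */
    static String extractBody(String html) {
        int open = html.indexOf("<body");
        if (open < 0) {
            return "";
        }
        int contentStart = html.indexOf('>', open);   // end of the opening <body ...> tag
        int close = html.indexOf("</body>", open);
        if (contentStart < 0 || close < 0 || contentStart > close) {
            return "";
        }
        String body = html.substring(contentStart + 1, close);
        // Undo double-encoded numeric entities only; other &amp; sequences are left alone.
        return body.replaceAll("&amp;#(\\d+);", "&#$1;");
    }
}
```

For example, `extractBody(outputStream.toString(StandardCharsets.UTF_8.name()))` could replace the `substringBetween`/`substringAfter` pair and the `replaceAll` call.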

Next steps

  1. Try the latest Aspose.Words for Java (download from the Aspose download page).
  2. If upgrading is not possible, implement the small “clear‑first‑cell‑background” loop shown above.
  3. Let us know the version you are currently using and whether the upgraded version resolves the issue. If the problem still persists, please share the exact version number and we will investigate further.

We look forward to hearing back from you and confirming that the conversion now behaves as expected.

Best regards,
[Your Name] – Aspose.Words Support Team

@akotak
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-28620

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.