I am translating RTF documents to HTML with Aspose.Words for Java.
One RTF document uses the Unicode character 8226 (U+2022) to show a bullet. In RTF it is encoded as \u8226. Saving with the windows-1252 encoding translates this correctly into HTML in Aspose. With other character sets it shows up as a double quote (").
The other bullet is paragraph formatting in RTF using \pnlvlblt. Saving with the ISO-8859-1 encoding translates this correctly into HTML in Aspose. With other character sets it shows up as a question mark (?).
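To make the encoding side of the \u8226 bullet concrete, here is a minimal sketch using only the JDK (no Aspose calls). It checks which of the character sets in question can represent U+2022 at all. The class name BulletEncodingCheck and the helper canEncodeBullet are made up for illustration, and this says nothing about how Aspose itself writes the character.

```java
import java.nio.charset.Charset;

public class BulletEncodingCheck {
    // The bullet character: code point 8226 decimal, 2022 hexadecimal.
    static final char BULLET = '\u2022';

    // Hypothetical helper: true if the named charset has a byte sequence for the bullet.
    static boolean canEncodeBullet(String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(BULLET);
    }

    public static void main(String[] args) {
        String[] names = {"windows-1252", "ISO-8859-1", "UTF-8"};
        for (String name : names) {
            System.out.println(name + " can encode the bullet: " + canEncodeBullet(name));
        }
    }
}
```

On a standard JDK this reports true for windows-1252 (which maps the bullet to byte 0x95) and for UTF-8, and false for ISO-8859-1, which only covers code points up to 255.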
Attached is a document that has both types of bullets in it. Typically a document will have only one type or the other.
Here is the code I am using for translation.
Document rtfdocument = new Document(in);
HtmlSaveOptions hso = new HtmlSaveOptions(SaveFormat.HTML);
hso.setPrettyFormat(true);
hso.setAllowNegativeIndent(true);
hso.setCssStyleSheetType(CssStyleSheetType.INLINE); // .EMBEDDED);
hso.setEncoding(Charset.forName("ISO-8859-1")); // windows-1252 is a subset of ISO-8859-1
hso.setExportHeadersFootersMode(ExportHeadersFootersMode.NONE);
hso.setExportImagesAsBase64(true);
rtfdocument.save(htmlout, hso);
Is there anything I can do to make Aspose handle both bullet types with just one character set?