I am trying to convert a DOCX file to HTML using version 10.6.0.
The conversion of the following characters is corrupted:
’
“
For example:
I have a Word document that contains the following line:
It is built from the characters ‘a-z’, ‘A-Z’, ‘0-9’, ‘.’ and ‘-‘
After the conversion, I get HTML with the following line:
It is built from the characters ⵜa-z䮢, ⵜA-Z䮢, ⵜ0-9䮢, ⵜ.䮢 and ⵜ-ⵜ
This is the code:
Document doc = new Document(inputStr);
HtmlSaveOptions saveOptions = new HtmlSaveOptions(SaveFormat.HTML);
saveOptions.setEncoding(java.nio.charset.Charset.forName("UTF-8"));
File out = new File(htmlFileName);
doc.save(out.getAbsolutePath(), saveOptions);
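
To narrow this down, it may help to check whether the corruption happens when the file is saved or only when it is read or displayed afterwards. The following is a rough sketch using only the Java standard library. It does not reproduce the exact garbled characters above; it only illustrates how correctly written UTF-8 bytes for curly quotes look wrong once they are decoded with a different charset. The class name EncodingCheck and the method decodeAs are hypothetical names chosen for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Encodes the text as UTF-8, then decodes those bytes with the given
    // charset, mimicking a viewer or reader that may assume another encoding.
    static String decodeAs(String text, Charset viewerCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, viewerCharset);
    }

    public static void main(String[] args) {
        // U+2018 and U+2019 are the left and right single quotation marks.
        String sample = "\u2018a-z\u2019";

        // Decoding with the matching charset round-trips the text.
        System.out.println(decodeAs(sample, StandardCharsets.UTF_8).equals(sample));

        // Decoding the same bytes as ISO-8859-1 does not.
        System.out.println(decodeAs(sample, StandardCharsets.ISO_8859_1).equals(sample));
    }
}
```

If the saved HTML file, read back explicitly as UTF-8, still contains the wrong characters, the problem is more likely in the conversion itself than in the program used to view the file.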