Minimal Broken page-Template.zip (46.9 KB)
The attached file is really a docx file; change its extension from zip to docx to open it.
We are currently using Aspose.Words for Java, version 20.1.
Saving this document to HTML with the Java code below generates an error.
Deleting almost any single part of the file (it is very small) makes it work.
This is weird. A workaround - if one exists - would be great, and a fix welcome, of course.
Thanks!
String convertToHTML(File docFile) {
    String htmlDoc = null;
    try {
        InputStream inpStr = new FileInputStream(docFile);
        LoadOptions loadOptions = new LoadOptions(LoadFormat.DOCX, "", null);
        Document inputDoc = new Document(inpStr, loadOptions);
        inpStr.close();
        inputDoc.joinRunsWithSameFormatting(); // html cleanup stage one
        ByteArrayOutputStream outStr = new ByteArrayOutputStream();
        HtmlSaveOptions saveOptions = new HtmlSaveOptions();
        saveOptions.setEncoding(Charset.forName("UTF-8"));
        inputDoc.save(outStr, saveOptions);
        htmlDoc = outStr.toString("UTF-8");
        outStr.close();
    } catch (Exception ex) {
        logger.error(ExceptionUtils.getStackTrace(ex));
    }
    return htmlDoc;
}
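In case it helps with reproducing this: since a docx is a ZIP package, the "parts" mentioned above can be listed with the JDK alone, which makes it easier to note which part was deleted when the save starts working. The following is only a rough sketch and does not use Aspose at all. The class name DocxParts and both method names are made up for illustration, and the check for word/document.xml assumes the conventional location of the main document part, which a package is not strictly required to use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Aspose.Words: inspects a docx as a plain ZIP archive.
final class DocxParts {

    // Returns the names of all non-directory entries (parts) in archive order.
    // The stream is closed when this method returns.
    static List<String> listParts(InputStream in) throws IOException {
        List<String> parts = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    parts.add(entry.getName());
                }
            }
        }
        return parts;
    }

    // Assumption: a typical docx has a content-types part and a main document
    // part at the conventional path word/document.xml.
    static boolean looksLikeDocx(List<String> parts) {
        return parts.contains("[Content_Types].xml")
                && parts.contains("word/document.xml");
    }
}
```

Calling DocxParts.listParts(new FileInputStream(docFile)) before and after removing a part gives two lists that can be compared to see exactly what changed between a failing and a working file.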