Hello
Here is a small test demonstrating that when an Aspose document is converted to HTML and then back to an Aspose document, some information is lost.
Maybe our save options are incomplete.
Could you please have a look and tell us whether it is possible to preserve the original formatting as much as possible? Thank you!
Here is the code:
private static HtmlSaveOptions getSaveOptions(final com.aspose.words.Document doc) {
    final HtmlSaveOptions saveOptions = new HtmlSaveOptions(SaveFormat.HTML);
    if (doc.getFirstSection().getPageSetup().getDifferentFirstPageHeaderFooter()) {
        saveOptions.setExportHeadersFootersMode(ExportHeadersFootersMode.FIRST_PAGE_HEADER_FOOTER_PER_SECTION);
    } else {
        saveOptions.setExportHeadersFootersMode(ExportHeadersFootersMode.FIRST_SECTION_HEADER_LAST_SECTION_FOOTER);
    }
    saveOptions.setExportListLabels(ExportListLabels.BY_HTML_TAGS);
    saveOptions.setExportTocPageNumbers(false);
    saveOptions.setEncoding(StandardCharsets.UTF_8);
    saveOptions.setExportImagesAsBase64(true);
    return saveOptions;
}
@Test
void aspose_doc_to_html_to_aspose_doc() throws Exception {
    // Create the Aspose document from the DOCX file
    final byte[] data = IOUtils
            .toByteArray(new ClassPathResource("track_change_toc.docx").getInputStream());
    final LoadOptions lo = new LoadOptions();
    lo.setLoadFormat(LoadFormat.AUTO);
    lo.setEncoding(StandardCharsets.UTF_8);
    final com.aspose.words.Document doc = new com.aspose.words.Document(new ByteArrayInputStream(data), lo);

    // Convert the Aspose document to HTML
    final ByteArrayOutputStream bos = new ByteArrayOutputStream();
    doc.save(bos, getSaveOptions(doc));
    final String html = bos.toString(StandardCharsets.UTF_8);

    // Convert the HTML back to an Aspose document
    final com.aspose.words.Document docFromHtml = new com.aspose.words.Document();
    final DocumentBuilder builderAspose = new DocumentBuilder(docFromHtml);
    builderAspose.insertHtml(html, false);
    builderAspose.getDocument().getCompatibilityOptions().setDoNotExpandShiftReturn(true);
    builderAspose.getDocument().save("afterConvert.docx", new OoxmlSaveOptions(SaveFormat.DOCX));
    // -> the resulting file is very different from the original one
}
The file track_change_toc.docx (14.5 KB) is attached.