Images in header are repeated after docx is converted to HTML

Hello,

data.zip (2.5 MB)

I am evaluating your product, especially the conversion of docx to HTML. It looks better than the current component we use, but I found an issue with images in headers: they are exported once for each section of the document.

I use CssStyleSheetType = CssStyleSheetType.Inline, ExportImagesAsBase64 = true, ImageResolution = 300 for HtmlSaveOptions.

You can find the attached document and the converted file.

Any ideas or suggestions are welcome.

Regards

@profiler This is expected behavior. It is hard to output headers and footers to HTML meaningfully because HTML is not paginated. By default, Aspose.Words exports only the primary headers/footers of each section when saving to HTML. Your document has two sections, so the primary header of each section appears in the output HTML. You can change ExportHeadersFootersMode to configure how headers/footers are exported to HTML.
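To make the per-section behavior concrete, here is a minimal toy model in plain Java. It is only a sketch: the class, enum, and method names below (HeaderExportSketch, Mode, Section, flatten) are hypothetical and are not part of the Aspose.Words API. It simply assumes that a non-paginated export emits each section's primary header and footer according to the selected mode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these types are hypothetical stand-ins, NOT the Aspose.Words API.
class HeaderExportSketch {

    // Simplified stand-ins for the export modes discussed in this thread.
    enum Mode { NONE, PER_SECTION, FIRST_SECTION_HEADER_LAST_SECTION_FOOTER }

    // A section reduced to one primary header, a body, and one primary footer.
    static final class Section {
        final String header;
        final String body;
        final String footer;

        Section(String header, String body, String footer) {
            this.header = header;
            this.body = body;
            this.footer = footer;
        }
    }

    // Flattens sections into the ordered list of blocks a non-paginated export would emit.
    static List<String> flatten(List<Section> sections, Mode mode) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < sections.size(); i++) {
            Section s = sections.get(i);
            boolean first = i == 0;
            boolean last = i == sections.size() - 1;
            boolean emitHeader = mode == Mode.PER_SECTION
                    || (mode == Mode.FIRST_SECTION_HEADER_LAST_SECTION_FOOTER && first);
            boolean emitFooter = mode == Mode.PER_SECTION
                    || (mode == Mode.FIRST_SECTION_HEADER_LAST_SECTION_FOOTER && last);
            if (emitHeader) out.add(s.header);
            out.add(s.body);
            if (emitFooter) out.add(s.footer);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two sections sharing the same header image, as in the reported document.
        List<Section> doc = Arrays.asList(
                new Section("[logo]", "Section 1 text", "[footer]"),
                new Section("[logo]", "Section 2 text", "[footer]"));
        System.out.println(flatten(doc, Mode.PER_SECTION));
        System.out.println(flatten(doc, Mode.FIRST_SECTION_HEADER_LAST_SECTION_FOOTER));
    }
}
```

In this model, the per-section mode emits the header image once per section (twice for a two-section document), while the first-header/last-footer mode emits it only once.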

Note, however, that the object models of HTML and MS Word documents are quite different, so it is not always possible to achieve 100% fidelity when converting from one format to the other.

PS: If the output HTML is for viewing purposes, i.e. it is not supposed to be edited or processed, you can consider using HtmlFixed format. In this case the output should look exactly the same as it looks in MS Word:

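import com.aspose.words.Document;
import com.aspose.words.HtmlFixedSaveOptions;

// Load the source document.
Document doc = new Document("C:\\Temp\\in.docx");
// Embed CSS, fonts, images and SVG so the output is a single self-contained HTML file.
HtmlFixedSaveOptions opt = new HtmlFixedSaveOptions();
opt.setExportEmbeddedCss(true);
opt.setExportEmbeddedFonts(true);
opt.setExportEmbeddedImages(true);
opt.setExportEmbeddedSvg(true);
doc.save("C:\\Temp\\out.html", opt);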

The HtmlFixed format is designed to preserve the original document layout for viewing purposes. So if your goal is to display the HTML on a page, this format can be considered as an alternative. Unfortunately, it does not support round-tripping back to DOCX at all.

Thanks for the quick reply! ExportHeadersFootersMode = ExportHeadersFootersMode.FirstSectionHeaderLastSectionFooter does the trick!

We want to use the resulting HTML as the body of an email. Are there any recommendations or suggestions that would be good to consider in this case?

@profiler There are no special recommendations for HTML that will be used as an e-mail body. Just keep in mind that HTML is quite a different format from an MS Word document, and it is impossible to provide 100% fidelity when converting an MS Word document to HTML.