Hello! In the attached zip file (example.zip), I’ve got a docx file that contains a simple table with a caption, and the output HTML file that’s a result of a simple HTML conversion.
When viewed in Word, the table caption appears to have the named style “Caption” applied. (The Modify menu for that style shows that it uses Font: Cambria, bold, 9 point.)
When converted to HTML, the caption text is wrapped in spans that specify the font, font weight and size, but the name of the applied style (“Caption”) is lost. In fact, it seems to be impossible to tell that the styling was applied via the named style and not by manually setting bold, 9 point Cambria:
Table 1
So, the question is: is there some configuration option I can specify that would allow me to determine that the applied text styling came from an actual Text Style, rather than just having the font, font weight and size applied directly?
Alternatively, is there a configuration option that will just drop all named styles (resulting in the caption text “Table 1” coming through as plain text)?
In the source .docx file, the text appears to be marked with the Text Style, so it appears that the translation from that named style to the font/weight/size style directives is happening in the HTML conversion:
Table
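For reference, here is a minimal sketch of how the named style can be checked in the source markup, using only the JDK's XML parser. In WordprocessingML a paragraph's named style is recorded as a w:pStyle element inside w:pPr. The SAMPLE fragment below is hand-written for illustration and is not copied from example.zip, and the class name ParagraphStyleProbe is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ParagraphStyleProbe {
    // Main WordprocessingML namespace used by word/document.xml.
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Illustrative fragment (an assumption, not the real file contents):
    // one paragraph with the "Caption" style, one with no named style.
    static final String SAMPLE =
        "<w:document xmlns:w=\"" + W_NS + "\"><w:body>"
        + "<w:p><w:pPr><w:pStyle w:val=\"Caption\"/></w:pPr>"
        + "<w:r><w:t>Table 1</w:t></w:r></w:p>"
        + "<w:p><w:r><w:t>Body text</w:t></w:r></w:p>"
        + "</w:body></w:document>";

    // Returns the style id of every paragraph that names a style.
    static List<String> styleIds(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagNameNS(W_NS, "pStyle");
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(((Element) nodes.item(i)).getAttributeNS(W_NS, "val"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(styleIds(SAMPLE)); // prints [Caption]
    }
}
```

To run the same check against a real document, the word/document.xml entry inside the .docx (which is a zip archive) could be read with java.util.zip.ZipFile and passed to styleIds.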
Here are the configuration options I’m currently using for the conversion:
HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.HTML)
options.setUseAntiAliasing(true)
options.setUseHighQualityRendering(true)
options.setScaleImageToShapeSize(false)
options.setImageSavingCallback(FilenameSanitizingSavingCallback)
options.setExportOriginalUrlForLinkedImages(true)
options.setExportListLabels(ExportListLabels.BY_HTML_TAGS)
Thanks!!
Dave