JAVA DOCX TO HTML BUG

Hi,
Please look at the images.
This is how the DOCX displays:

This is how the HTML displays:

@wiy666 Could you please attach your input and output documents here for testing? We will check the issue and provide you more information.

@alexey.noskov test_docx.zip (13.7 KB)

@wiy666 Thank you for the additional information. As far as I can see, Aspose.Words produces HTML that looks the same as the HTML produced by MS Word.
If you need HTML that looks exactly like the document in MS Word, you can try using the HtmlFixed save format instead of flow HTML:

// Java syntax: escaped backslashes instead of C# verbatim strings.
Document doc = new Document("C:\\Temp\\in.docx");
doc.save("C:\\Temp\\out_fixed.html", SaveFormat.HTML_FIXED);

out.zip (7.5 KB)

@alexey.noskov Thank you for your help. However, I want to save the images to a different path.
This is my previous code:

HtmlSaveOptions options = new HtmlSaveOptions();
options.setImageResolution(255);
options.setScaleImageToShapeSize(false);
options.setImagesFolder(new File(datePath));
Long startTs = System.currentTimeMillis();
options.setImageSavingCallback(new SavedImageRename(basePath, startTs.toString()));
String htmlStr = doc.toString(options);

I can’t find a method that is compatible with both. For example:

doc.Save(@"C:\Temp\out_fixed.html", HtmlSaveOptions,SaveFormat.HtmlFixed);

@wiy666 You can specify options using HtmlFixedSaveOptions. See the ResourcesFolder and ResourcesFolderAlias properties. Also, your code uses IImageSavingCallback; you can achieve the same in fixed HTML using IResourceSavingCallback.
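As a rough sketch of the renaming part only: the helpers below build a prefixed file name and a matching URI, which is the kind of logic a resource-saving callback would apply. The class and method names (ResourceNaming, renamedFile, resourceUri) and the sample paths are hypothetical and not part of Aspose.Words; the snippet uses only the JDK.

```java
class ResourceNaming {
    /** Hypothetical helper: prefixes a resource file name, e.g. with a timestamp. */
    static String renamedFile(String originalFileName, String prefix) {
        return prefix + "_" + originalFileName;
    }

    /** Hypothetical helper: joins a folder alias and a file name with exactly one slash. */
    static String resourceUri(String folderAlias, String fileName) {
        String base = folderAlias.endsWith("/") ? folderAlias : folderAlias + "/";
        return base + fileName;
    }

    public static void main(String[] args) {
        // The timestamp prefix mirrors the startTs value used in the earlier code.
        String name = renamedFile("image001.png", String.valueOf(System.currentTimeMillis()));
        System.out.println(resourceUri("/static/docs", name));
    }
}
```

Inside an IResourceSavingCallback implementation, these values would presumably be applied through the file name and URI properties of the ResourceSavingArgs passed to the callback; please check the exact setter names against the API reference for your version.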

Note also that your code uses the Document.toString method, which supports only the TEXT and flow HTML formats, so you cannot pass HtmlFixedSaveOptions to it. If you need to convert your document to an HtmlFixed string, you can use code like the following:

HtmlFixedSaveOptions opt = new HtmlFixedSaveOptions();
// Here you can specify your options. 
// For demonstration purposes all resources are embedded into the HTML.
opt.setExportEmbeddedSvg(true);
opt.setExportEmbeddedImages(true);
opt.setExportEmbeddedCss(true);
opt.setExportEmbeddedFonts(true);

ByteArrayOutputStream dstStream = new ByteArrayOutputStream();
doc.save(dstStream, opt);

byte[] htmlBytes = dstStream.toByteArray();
String html = new String(htmlBytes, StandardCharsets.UTF_8);
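If the same save-to-string step is needed in several places, it could be wrapped in a small helper. This is only a sketch: HtmlToString, StreamSaver and saveToString are hypothetical names, and the demo writes plain bytes instead of calling Aspose.Words so that it runs with the JDK alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class HtmlToString {
    /** Hypothetical callback type: anything that can write itself to a stream. */
    @FunctionalInterface
    interface StreamSaver {
        void saveTo(OutputStream out) throws Exception;
    }

    /** Runs the saver against an in-memory stream and decodes the bytes as UTF-8. */
    static String saveToString(StreamSaver saver) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            saver.saveTo(buffer);
        } catch (Exception e) {
            throw new IllegalStateException("Saving to the in-memory stream failed", e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for a real document save: write some UTF-8 encoded HTML.
        String html = saveToString(
                out -> out.write("<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8)));
        System.out.println(html);
    }
}
```

With the document and options from the code above, the call would look like `String html = HtmlToString.saveToString(out -> doc.save(out, opt));`. Decoding with an explicit charset avoids depending on the platform default encoding.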

@alexey.noskov Thank you for your help.
I think I found it!
Thank you very much. :slight_smile:
