Error converting DOCX to HTML using Aspose.Words for Java

Hi,

Greetings.

We use Aspose.Words for Java to convert DOCX files to HTML.

When we try to convert the attached DOCX file to HTML, the conversion fails with an error.
image.png (2.5 KB)

Please suggest how we can work around or resolve this issue.

Please use the attached docx file.
testFile.docx (48.6 KB)

@EdwinPearson,

In an initial test with the latest licensed version (21.9) of Aspose.Words for Java, we were unable to reproduce this problem on our end (see the output generated by Aspose.Words 21.9: awjava-21.9.zip (8.0 KB)). We suggest that you upgrade to the latest version (21.9) of Aspose.Words for Java.

Sir,

We upgraded to Aspose.Words for Java 21.9, but we still get the error “addition of a duplicate key to a dictionary” when we run the code below:

    final Document document = new Document(file.getInputStream());
    document.acceptAllRevisions();

    // Get page count
    document.updatePageLayout();
    int pageCount = document.getPageCount();

    // Remove paragraphs with zero line spacing
    for (Paragraph paragraph : document.getFirstSection().getBody().getParagraphs())
        if (paragraph.getParagraphFormat().getLineSpacing() == 0)
            paragraph.remove();

    final HtmlSaveOptions options = new HtmlSaveOptions(HTML);
    options.setExportFontsAsBase64(true);
    options.setExportImagesAsBase64(true);
    options.setPrettyFormat(true);
    options.setScaleImageToShapeSize(true);
    options.setExportPageSetup(true);
    document.save(String.valueOf(targetLocation), options);
    document.cleanup();

Please share your valuable suggestions to resolve this issue.

Best Regards
Moses

@mosesm1,

Thanks for providing the code. We have logged this problem in our issue tracking system as WORDSNET-22810. We will look further into the details and keep you updated here on the status of the fix. We apologize for the inconvenience.

@mosesm1,

Regarding WORDSNET-22810, we have completed the analysis of this issue and found a bug in the HTML module of Aspose.Words. The HTML writer uses information from the page layout engine, but because the code modifies the document after the layout has been built, the page layout becomes outdated. As a workaround, you can update the page layout again by calling the Document.updatePageLayout() method just before saving the document. We will inform you via this thread as soon as this issue is resolved.
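
In terms of the code posted earlier in this thread, the workaround amounts to adding a second document.updatePageLayout() call after the loop that removes paragraphs and before document.save(...). The sketch below illustrates why the order of calls matters. It is a toy model only: ToyDocument, its fields, the "two paragraphs per page" rule, and the exception it throws are all hypothetical stand-ins and do not reflect the Aspose.Words API or its internals.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: ToyDocument is hypothetical and is NOT part of Aspose.Words.
public class StaleLayoutDemo {

    public static class ToyDocument {
        private final List<String> paragraphs = new ArrayList<>();
        // Cached layout: page number of each paragraph, in paragraph order.
        private final List<Integer> pageOfParagraph = new ArrayList<>();

        public void addParagraph(String text) {
            paragraphs.add(text);
        }

        public void removeParagraph(int index) {
            paragraphs.remove(index); // the cached layout is NOT updated here
        }

        // Rebuilds the cache; assumes two paragraphs fit on one page.
        public void updatePageLayout() {
            pageOfParagraph.clear();
            for (int i = 0; i < paragraphs.size(); i++) {
                pageOfParagraph.add(i / 2 + 1);
            }
        }

        public int getPageCount() {
            return pageOfParagraph.isEmpty()
                    ? 0
                    : pageOfParagraph.get(pageOfParagraph.size() - 1);
        }

        // The writer trusts the cache and fails if it no longer matches the content.
        public String save() {
            if (pageOfParagraph.size() != paragraphs.size()) {
                throw new IllegalStateException("layout cache is out of date");
            }
            StringBuilder html = new StringBuilder();
            for (int i = 0; i < paragraphs.size(); i++) {
                html.append("<p data-page=\"").append(pageOfParagraph.get(i)).append("\">")
                    .append(paragraphs.get(i)).append("</p>");
            }
            return html.toString();
        }
    }

    public static void main(String[] args) {
        ToyDocument doc = new ToyDocument();
        doc.addParagraph("first");
        doc.addParagraph("");
        doc.addParagraph("third");

        doc.updatePageLayout();              // layout built for three paragraphs
        int pageCount = doc.getPageCount();  // read while the layout is still valid
        doc.removeParagraph(1);              // edit after layout: cache is now stale

        doc.updatePageLayout();              // the workaround: refresh just before saving
        System.out.println(pageCount + " page(s) before edit: " + doc.save());
    }
}
```

Without the second updatePageLayout() call, save() in this toy model throws because the cached layout still describes three paragraphs. The real failure in Aspose.Words surfaced differently (as the duplicate-key error reported above), but the remedy has the same shape: rebuild the layout after the last modification.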

The issues you have found earlier (filed as WORDSNET-22810) have been fixed in this Aspose.Words for Java 22.6 update.