Remove Space between Characters using Java | DOCX to HTML conversion


When converting a DOCX file to HTML paragraph by paragraph, we get output in which the text is split into tiny fragments, often single letters, each wrapped in a separate <span> tag.

e.g.: <span style=\"font-family:Arial; font-size:11pt\">The</span><span style=\"font-family:Arial; font-size:11pt; letter-spacing:0.05pt\"> </span><span style=\"font-family:Arial; font-size:11pt\">ove</span><span style=\"font-family:Arial; font-size:11pt; letter-spacing:0.05pt\">r</span>...
As a result, the HTML output ends up being 2 MB when converting a 50-page Word document.

Analyzing the output, I’ve observed that the only difference between adjacent spans is the letter-spacing property.

Is there any way to remove the letter-spacing from the HTML and then combine the <span> tags that have the same formatting?

I’ve tried paragraph.joinRunsWithSameFormatting(); but it doesn’t help much in my case.

The code: (74.3 KB)

The name of the Word document is CharacterSpacingIssue.docx.

Library version: Aspose.Words for Java 21.1.

I’ve figured out how to remove and join spans for this specific use case:

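for (Run run : paragraph.getRuns()) {
    // Assumed loop body (the posted snippet is cut off here): reset character spacing to 0 pt
    run.getFont().setSpacing(0);
}
// Once the spacing is the same everywhere, adjacent runs can be merged
paragraph.joinRunsWithSameFormatting();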
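
If the document model cannot be changed before export, the same cleanup could in principle be done on the HTML string itself. The following is only a rough sketch in plain Java, not part of Aspose.Words: the class name SpanMerger and the method mergeSpans are made up for illustration, and it assumes flat, non-nested spans that carry nothing but a style attribute, as in the sample output above. Dropping letter-spacing also changes the rendering slightly.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper (not an Aspose API): strips letter-spacing from span
// styles and merges directly adjacent spans whose remaining style is identical.
class SpanMerger {
    // Matches one flat <span style="...">text</span> element without nested tags.
    private static final Pattern SPAN =
            Pattern.compile("<span style=\"([^\"]*)\">([^<]*)</span>");

    // Returns the style string with every letter-spacing declaration removed.
    private static String withoutLetterSpacing(String style) {
        return Arrays.stream(style.split(";"))
                .map(String::trim)
                .filter(d -> !d.isEmpty() && !d.toLowerCase().startsWith("letter-spacing"))
                .collect(Collectors.joining("; "));
    }

    static String mergeSpans(String html) {
        Matcher m = SPAN.matcher(html);
        StringBuilder out = new StringBuilder();
        String openStyle = null; // style of the span currently being accumulated
        int last = 0;            // end offset of the previous match
        while (m.find()) {
            String style = withoutLetterSpacing(m.group(1));
            boolean adjacent = m.start() == last;
            if (openStyle != null && adjacent && style.equals(openStyle)) {
                // Same formatting, nothing in between: extend the open span.
                out.append(m.group(2));
            } else {
                if (openStyle != null) {
                    out.append("</span>");
                }
                // Copy any markup between the previous span and this one unchanged.
                out.append(html, last, m.start());
                out.append("<span style=\"").append(style).append("\">").append(m.group(2));
                openStyle = style;
            }
            last = m.end();
        }
        if (openStyle != null) {
            out.append("</span>");
        }
        out.append(html.substring(last));
        return out.toString();
    }
}
```

Applied to the four spans from the example above (with plain double quotes), SpanMerger.mergeSpans(html) returns a single span: <span style="font-family:Arial; font-size:11pt">The over</span>. Spans with nested tags or additional attributes do not match the pattern and are left as they are.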



It is nice to hear that your problem has been solved. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.