Hello,
When converting a DOCX file to HTML paragraph by paragraph, we get output in which almost every letter or short run of letters is wrapped in its own <span> tag.
e.g.: <span style=\"font-family:Arial; font-size:11pt\">The</span><span style=\"font-family:Arial; font-size:11pt; letter-spacing:0.05pt\"> </span><span style=\"font-family:Arial; font-size:11pt\">ove</span><span style=\"font-family:Arial; font-size:11pt; letter-spacing:0.05pt\">r</span>...
As a result, the HTML output for a 50-page Word document ends up at 2 MB.
Looking at the output, I’ve observed that the only difference between adjacent spans is the letter-spacing property.
Is there any way to remove letter-spacing from the HTML and then combine the <span> tags that have the same formatting?
I’ve tried paragraph.joinRunsWithSameFormatting(), but it doesn’t help much in my case.
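To make the desired result concrete, here is a minimal sketch of the clean-up we have in mind, done as plain-Java post-processing of the HTML string rather than through Aspose.Words. It is an illustration under assumptions, not a proposed fix: the class and method names (SpanMerger, normalizeStyle, merge) are hypothetical, it assumes flat, non-nested spans with unescaped double-quoted style attributes as in the example above, and it simply discards letter-spacing, which may slightly change rendering.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.Words: it only rewrites an HTML string.
final class SpanMerger {
    // One flat <span style="...">text</span> element (assumes no nested tags inside).
    private static final Pattern SPAN =
            Pattern.compile("<span style=\"([^\"]*)\">([^<]*)</span>");

    // Drops letter-spacing declarations and normalizes the separator to "; ".
    static String normalizeStyle(String style) {
        StringBuilder out = new StringBuilder();
        for (String decl : style.split(";")) {
            String d = decl.trim();
            if (d.isEmpty() || d.toLowerCase().startsWith("letter-spacing")) {
                continue;
            }
            if (out.length() > 0) {
                out.append("; ");
            }
            out.append(d);
        }
        return out.toString();
    }

    // Merges directly adjacent spans whose styles are equal once letter-spacing is removed.
    static String merge(String html) {
        Matcher m = SPAN.matcher(html);
        StringBuilder out = new StringBuilder();
        StringBuilder text = new StringBuilder();
        String openStyle = null; // style of the span being accumulated, if any
        int last = 0;            // end offset of the previous matched span
        while (m.find()) {
            String style = normalizeStyle(m.group(1));
            boolean adjacent = m.start() == last;
            if (openStyle != null && adjacent && style.equals(openStyle)) {
                text.append(m.group(2)); // same formatting: extend the current span
            } else {
                flush(out, openStyle, text);
                out.append(html, last, m.start()); // copy content between spans unchanged
                openStyle = style;
                text.setLength(0);
                text.append(m.group(2));
            }
            last = m.end();
        }
        flush(out, openStyle, text);
        out.append(html.substring(last));
        return out.toString();
    }

    // Writes the accumulated span, if there is one.
    private static void flush(StringBuilder out, String style, StringBuilder text) {
        if (style != null) {
            out.append("<span style=\"").append(style).append("\">")
               .append(text).append("</span>");
        }
    }

    public static void main(String[] args) {
        String html = "<span style=\"font-family:Arial; font-size:11pt\">The</span>"
                + "<span style=\"font-family:Arial; font-size:11pt; letter-spacing:0.05pt\"> </span>"
                + "<span style=\"font-family:Arial; font-size:11pt\">ove</span>"
                + "<span style=\"font-family:Arial; font-size:11pt; letter-spacing:0.05pt\">r</span>";
        System.out.println(merge(html));
    }
}
```

For the four spans from the example, this yields a single span containing "The over". Ideally we would get the same effect through the library itself, before or during saving, instead of parsing the HTML afterwards.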
The code: tc-aspose-evaluation.zip (74.3 KB)
The name of the Word document is CharacterSpacingIssue.docx.
Library version: Aspose.Words for Java 21.1.