Free Support Forum - aspose.com

Excessive <span> in html generation

Hi there,

I’m developing a tool that uses Aspose.Words to convert Word documents to HTML. After I sent my first working version to our testing team, they reported that the HTML output had been generated with a very large number of <span>s, often in rather odd places, for example:

between wwords

The HTML code that is generated will be frequently modified by other members of my team, so it is very desirable that it be as human-readable as possible. Are there any remedies for this?

I have a feeling that the issue with the <span>s is directly related to “runs” within a “paragraph” being (perhaps unnecessarily) fragmented. However, I haven’t figured out an easy way to merge similar runs, or whether merging them would alter the layout.

Thanks,
Dave

Hi Dave,

Thanks for your request. This can occur because the text in your document consists of multiple Runs. Usually this happens when a document has been edited multiple times in MS Word.

There is a JoinRunsWithSameFormatting method, which concatenates adjacent runs that have the same formatting. You can try calling this method before saving the document as HTML.

http://www.aspose.com/documentation/.net-components/aspose.words-for-.net-and-java/aspose.words.document.joinrunswithsameformatting.html
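
To illustrate why joining runs reduces the number of <span>s, here is a minimal, self-contained sketch in Java. It does not use the Aspose.Words API: the Run class and the joinRuns and toHtml helpers are hypothetical stand-ins, and formatting is reduced to two flags (bold and italic) purely as an assumption for the example. It only shows the general idea of merging adjacent runs with identical formatting before emitting one element per run.

```java
import java.util.ArrayList;
import java.util.List;

class RunMergeSketch {

    // Hypothetical stand-in for a text run: some text plus simplified formatting.
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }

        boolean sameFormatting(Run other) {
            return bold == other.bold && italic == other.italic;
        }
    }

    // Merges each run into the previous one when both share the same formatting.
    static List<Run> joinRuns(List<Run> runs) {
        List<Run> merged = new ArrayList<>();
        for (Run run : runs) {
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last).sameFormatting(run)) {
                Run prev = merged.get(last);
                merged.set(last, new Run(prev.text + run.text, prev.bold, prev.italic));
            } else {
                merged.add(run);
            }
        }
        return merged;
    }

    // Emits one <span> per run (text is not HTML-escaped, to keep the sketch short).
    static String toHtml(List<Run> runs) {
        StringBuilder sb = new StringBuilder("<p>");
        for (Run run : runs) {
            String style = (run.bold ? "font-weight:bold;" : "")
                         + (run.italic ? "font-style:italic;" : "");
            sb.append(style.isEmpty() ? "<span>" : "<span style=\"" + style + "\">");
            sb.append(run.text).append("</span>");
        }
        return sb.append("</p>").toString();
    }

    public static void main(String[] args) {
        // A word split across two runs with identical formatting, then a bold run.
        List<Run> fragmented = List.of(
            new Run("between w", false, false),
            new Run("ords", false, false),
            new Run(" in bold", true, false));

        System.out.println(toHtml(fragmented));           // three spans
        System.out.println(toHtml(joinRuns(fragmented))); // two spans
    }
}
```

In this sketch the first two runs collapse into a single "between words" run, so the paragraph is written with two spans instead of three, while the bold run stays separate because its formatting differs. With Aspose.Words itself, the equivalent step would be calling the JoinRunsWithSameFormatting method on the document before saving it as HTML.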

Best regards.

Ahh, that is just what I am looking for. Thanks for the help!