Free Support Forum - aspose.com

Random SPAN tags breaking words and paragraphs in unexplained locations

Hi,

I am using Aspose.Words for Java to parse a MS Word doc and then generate HTML from it. With the attached .doc, the following snippet of HTML is generated:

The proposed Optimum? AcreMax?1 system would be unique in the industry because it would eliminate the need for separate refuge for which of the following?? 24-Feb-2010-3. This is new text.? What will it look like?

Notice how the ‘3’ is separated into its own span, as well as the following ". ".

Is this a problem with the .doc file or is this something that Aspose is doing? Is there a way to reduce or eliminate these issues?

Regards,
Dave Wolf

Hi,

Thanks for your request. This can occur because the text in your document consists of multiple Runs. Usually this happens when a document has been edited multiple times in MS Word.

There is a joinRunsWithSameFormatting method, which concatenates runs that have the same formatting. So you can try calling this method before saving the document as HTML.

http://www.aspose.com/documentation/.net-components/aspose.words-for-.net-and-java/com/aspose/words/document.html#joinRunsWithSameFormatting()
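
To illustrate the idea, here is a rough, self-contained sketch of what joining runs with the same formatting means. It does not use Aspose.Words at all and is not how the library implements it: the Run class, its bold/italic fields, and the toHtml helper are hypothetical stand-ins, and the sketch assumes one span is emitted per run. In the real library you would just call joinRunsWithSameFormatting() on the Document before saving it as HTML.

```java
import java.util.ArrayList;
import java.util.List;

public class RunJoinSketch {
    /** Hypothetical stand-in for a Word run: a piece of text plus its formatting. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }

        boolean sameFormatting(Run other) {
            return bold == other.bold && italic == other.italic;
        }
    }

    /** Merges each run into its predecessor when both share the same formatting. */
    static List<Run> joinRunsWithSameFormatting(List<Run> runs) {
        List<Run> joined = new ArrayList<>();
        for (Run run : runs) {
            int last = joined.size() - 1;
            if (last >= 0 && joined.get(last).sameFormatting(run)) {
                Run prev = joined.get(last);
                joined.set(last, new Run(prev.text + run.text, prev.bold, prev.italic));
            } else {
                joined.add(run);
            }
        }
        return joined;
    }

    /** Emits one span per run (assumption); no HTML escaping, sketch only. */
    static String toHtml(List<Run> runs) {
        StringBuilder html = new StringBuilder();
        for (Run run : runs) {
            html.append("<span>").append(run.text).append("</span>");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        // Fragmented runs similar to the ones described in the question.
        List<Run> runs = new ArrayList<>();
        runs.add(new Run("24-Feb-2010-", false, false));
        runs.add(new Run("3", false, false));
        runs.add(new Run(". ", false, false));
        runs.add(new Run("This is new text.", false, false));

        System.out.println(toHtml(runs));                             // before joining
        System.out.println(toHtml(joinRunsWithSameFormatting(runs))); // after joining
    }
}
```

Runs whose formatting really differs (for example a bold word in the middle of a sentence) stay separate, so some spans will remain in the output after joining.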

Best regards.

Works great. Thanks!