Doc to XML Conversion Error

Hi team,

During conversion from DOC to XML, some of the sentences are being split for no apparent reason.

  1. Could you please tell us why the sentences are split?
  2. Is there a way to avoid the split?

I have already tried the following:

  1. Trying the different SaveFormat values available for Word to XML conversion.
    SaveFormat | Aspose.Words for Java
  2. Protecting the document by making it read-only.
    Aspose.Words Features Missing in Apache POI | Aspose.Words for Java
  3. Setting the language of the document.
  4. Setting the language and region of the document (setLocaleId) through the builder class to en-US instead of de-DE.
    Font | Aspose.Words for Java

However, none of these produced the expected output.

I have attached the Word document, the defective XML output, and screenshots that illustrate the issue.

We need a workaround for this issue as soon as possible.

Priya Dharshini J P (150.9 KB)
Capture.PNG (9.8 KB)
Capture_xml.PNG (11.3 KB)


Thanks for your inquiry. Please note that Aspose.Words mimics MS Word behavior, and Aspose.Words for Java appears to be producing the expected output here. If you convert your Word document to XML using MS Word, you will see the same results.
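
If the split cannot be avoided at save time, one possible workaround is to re-join the pieces when reading the XML. The sketch below is only an illustration and rests on an assumption that cannot be verified from this thread: that the "split" shows up as one sentence spread over several run elements (`w:r`, each holding a `w:t` text element) inside a single paragraph (`w:p`). The class name `RunTextJoiner`, the method `paragraphTexts`, and the sample XML are hypothetical, and the code uses only the JDK's built-in XML parser, not the Aspose.Words API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: rebuilds paragraph text from split runs in a WordML-like XML string.
public class RunTextJoiner {

    // Returns one string per <p> element, made by concatenating the text of all
    // <t> elements found inside it, in document order.
    public static List<String> paragraphTexts(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the "*" namespace wildcard below
        Document dom = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        List<String> result = new ArrayList<>();
        // Assumption: paragraphs have local name "p" and text elements local name "t".
        // Matching any namespace keeps the sketch independent of the exact XML flavour,
        // at the cost of also matching unrelated elements that share those local names.
        NodeList paragraphs = dom.getElementsByTagNameNS("*", "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element paragraph = (Element) paragraphs.item(i);
            NodeList texts = paragraph.getElementsByTagNameNS("*", "t");
            StringBuilder joined = new StringBuilder();
            for (int j = 0; j < texts.getLength(); j++) {
                joined.append(texts.item(j).getTextContent());
            }
            result.add(joined.toString());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample: one sentence stored as two runs in the same paragraph.
        String xml = "<w:body xmlns:w=\"urn:example:wordml\">"
                + "<w:p><w:r><w:t>Dies ist </w:t></w:r><w:r><w:t>ein Satz.</w:t></w:r></w:p>"
                + "</w:body>";
        System.out.println(paragraphTexts(xml)); // prints [Dies ist ein Satz.]
    }
}
```

This only changes how the XML is read; it does not change what Aspose.Words or MS Word writes. Paragraphs nested inside other paragraphs (for example in text boxes) would be reported twice by this simplified version.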