Free Support Forum - aspose.com

Converting from *.doc to *.html breaks a word into two spans

Hi,
I need to convert a .doc file into an HTML file.
I have created a Document object as follows :

Document document = new Document(file1);
document.acceptAllRevisions();
ByteArrayOutputStream outStream1 = new ByteArrayOutputStream();
HtmlSaveOptions options = new HtmlSaveOptions();
options.setSaveFormat(SaveFormat.HTML);
options.setExportTextInputFormFieldAsText(true);
options.setImagesFolder(imagesDir.getPath());
options.setEncoding(java.nio.charset.Charset.forName("UTF-8"));
options.setPrettyFormat(true);
document.save(outStream1, options);


When I write the resulting output stream to a file, certain words from the original document are split across separate spans in the HTML.

The expected HTML output was:


This Services Schedule is effective as of the date signed by you in the signature block on the last page of this Services Schedule.



However, the HTML generated by Aspose is:


This Services Schedule is effective as of the date signed by you in the signature block on the last page of this Se


rvices Schedule.


A lot of my code depends on the document being parsed correctly by Aspose, so this is a critical bug for my application.
Any help with this would be highly appreciated.

Regards,
Siddarth

Hi Siddarth,

Thanks for your inquiry. Please try calling the Document.JoinRunsWithSameFormatting method, which joins runs that have the same formatting in all paragraphs of the document, before converting to HTML. If the problem persists, please attach your input Word document and the output HTML file showing the undesired behaviour here for testing. I will investigate the issue on my side and provide you with more information.
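In the Java API the call should be document.joinRunsWithSameFormatting(), placed before document.save(outStream1, options). To illustrate why this helps, here is a minimal sketch of the idea. It is not Aspose code: RunJoinSketch, Run, and toHtml are hypothetical names, and it assumes the exporter writes one span per run, so a word that Word stored as two runs with identical formatting comes out as two spans unless the runs are merged first.

```java
import java.util.ArrayList;
import java.util.List;

class RunJoinSketch {
    // Hypothetical stand-in for a Word run: some text plus a formatting key.
    static final class Run {
        final String text;
        final String format;

        Run(String text, String format) {
            this.text = text;
            this.format = format;
        }
    }

    // Merges each run into the previous one when both carry identical formatting.
    static List<Run> joinRunsWithSameFormatting(List<Run> runs) {
        List<Run> joined = new ArrayList<>();
        for (Run run : runs) {
            int last = joined.size() - 1;
            if (last >= 0 && joined.get(last).format.equals(run.format)) {
                Run previous = joined.get(last);
                joined.set(last, new Run(previous.text + run.text, previous.format));
            } else {
                joined.add(run);
            }
        }
        return joined;
    }

    // Writes one span per run (assumed exporter behaviour, for illustration only).
    static String toHtml(List<Run> runs) {
        StringBuilder html = new StringBuilder();
        for (Run run : runs) {
            html.append("<span style=\"").append(run.format).append("\">")
                .append(run.text)
                .append("</span>");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        List<Run> runs = new ArrayList<>();
        runs.add(new Run("on the last page of this Se", "font-size:10pt"));
        runs.add(new Run("rvices Schedule.", "font-size:10pt"));

        System.out.println("Before: " + toHtml(runs));
        System.out.println("After:  " + toHtml(joinRunsWithSameFormatting(runs)));
    }
}
```

Runs with different formatting are left untouched, so any remaining split spans after the merge would point to a real formatting difference in the source document.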

Best regards,

Hi Awais,
This solved the issue.
Thanks a lot, this made my day!

Regards,
Siddarth

Hi Siddarth,


Thanks for your feedback. It’s great you were able to find what you were looking for. Please let us know if you have any further queries.

Best regards,