Word document formatting issues

Hello Team,

When we convert a Word document to HTML using Aspose and view the result, the table of contents is disturbed and its page numbers are not visible. The table of contents heading also has spacing issues. Please look into this as a priority, as our enterprise clients are affected.

The original file and the downloaded file are attached for your reference.

Files.zip (44.7 KB)

Thank you!

@amit.tripathi,

When you use ‘Save As’ in MS Word 2019 to save this ‘Original file.docx’ document to HTML format, you will observe the same behavior. Please see the following MS Word 2019 generated HTML file:

So this seems to be expected behavior. Please let me know if I can be of any further assistance.

Hi Awais,

When we convert the document to HTML in MS Word 2019, no errors occur, as shown in the screenshot below.

image.png (84.4 KB)

However, when we download the same document from our software after converting it to HTML using Aspose, we notice the issues shown below.

image.png (88.9 KB)

I have also attached both files for your reference, named “Original Document” and “Downloaded document” respectively.

Please let me know if we should schedule a quick call to discuss this in detail and demonstrate it in our product. This needs to be fixed as a priority.

Thanks & Regards

Documents.zip (41.7 KB)

@amit.tripathi,

Can you please try converting to HTML by using the following code and see how it goes on your end?

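using Aspose.Words;
using Aspose.Words.Saving;

// Load the source document.
Document doc = new Document("E:\\temp\\Documents\\Original Document.docx");

// Write indented HTML and export list labels as native HTML list tags.
HtmlSaveOptions opts = new HtmlSaveOptions(SaveFormat.Html);
opts.PrettyFormat = true;
opts.ExportListLabels = ExportListLabels.ByHtmlTags;

doc.Save("E:\\temp\\Documents\\19.9.html", opts);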

If the problem persists, please tell us whether you are using any other third-party software to process the Aspose.Words generated HTML before it is downloaded.

Please provide the complete steps, along with a simplified console application, so that we can reproduce the exact same behavior in the downloaded file on our end. Thanks for your cooperation.

Hello Awais,

We have tried it, but it does not resolve the issue. We described the issue in detail above and attached supporting documents as a ZIP, but I will explain it again. When a .docx document that has a table of contents with numbering is converted to HTML, the converted HTML does not have the numbers in the TOC. When this HTML is then converted back into a .docx file, there are two issues: 1) there is no numbering in the TOC, as the converted HTML does not have it; 2) the numbers in the .docx are not rearranged automatically. The ZIP file is attached to the post above.
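For reference, the round trip described here can be sketched as a small console program. This is only a minimal sketch under assumptions, not our production code: the input path is the one from the sample code earlier in this thread, the two output file names are hypothetical, and the helper CountTocFields is made up for illustration. It assumes the Aspose.Words for .NET package is referenced.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Fields;

class TocRoundTrip
{
    static void Main()
    {
        // Input path taken from the sample earlier in this thread.
        // The two output file names below are hypothetical.
        string source = "E:\\temp\\Documents\\Original Document.docx";
        string html = "E:\\temp\\Documents\\RoundTrip.html";
        string result = "E:\\temp\\Documents\\RoundTrip.docx";

        // Step 1: DOCX -> HTML with default HTML save settings.
        Document original = new Document(source);
        Console.WriteLine("TOC fields in original: " + CountTocFields(original));
        original.Save(html, SaveFormat.Html);

        // Step 2: HTML -> DOCX.
        Document reloaded = new Document(html);
        Console.WriteLine("TOC fields after round trip: " + CountTocFields(reloaded));
        reloaded.Save(result, SaveFormat.Docx);
    }

    // Hypothetical helper: counts the TOC fields present in a document.
    static int CountTocFields(Document doc)
    {
        int count = 0;
        foreach (Field field in doc.Range.Fields)
        {
            if (field.Type == FieldType.FieldTOC)
                count++;
        }
        return count;
    }
}
```

Comparing the two printed counts, and opening the original and the round-tripped .docx side by side, should make it easier to see whether the TOC and its numbering survive the conversion.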

Thanks,
David

@amit.tripathi

We are checking this scenario and will get back to you soon.

@amit.tripathi,

We tested the scenario and have managed to reproduce the same problem on our end. We have logged this problem in our issue tracking system so that it can be corrected. The ID of this issue is WORDSNET-19329. We will look further into the details of this problem and keep you updated on the status of the fix. We apologize for the inconvenience.

The issues you found earlier (filed as WORDSNET-19329) have been fixed in the Aspose.Words for .NET 23.2 update, which is also available on NuGet.