Styles in MHTML output

I am converting a Word document to MHTML for a client’s use, and they are telling me the following:

“The tool does not seem to be outputting the same classes/structure as it does when you just do ‘save as’. The other documents all contained class definitions of MsoListNumberX to show how deeply nested a list was, for example.”

It seems that when you use the Word “Save As” function to save a document as MHTML, nearly every element gets a class attribute, but an Aspose.Words conversion does not produce these classes. My client needs SOME class in place so that they can apply their own styling after conversion.

Is there a way for me to make the tooling add these classes? I am running Aspose.Words for .NET 15.2.0.0.

The ZIP has 2 files:

  • EP4_1_2.mht | This is the file created by Aspose.Words
  • EP4_1_2_Word.mht | This is a file created from Word Save As.

EP4_1_2.zip (101.9 KB)
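
For anyone who wants to see exactly which classes differ between the two attached files, here is a minimal sketch using only the Python standard library. It is not part of Aspose.Words or Word; the helper names (`ClassCollector`, `html_part_of_mhtml`, `class_counts`, `missing_classes`) are made up for illustration, and it assumes each .mht file is a MIME message whose first text/html part holds the document body.

```python
import email
from collections import Counter
from email import policy
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Counts every class name found in a class attribute (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.classes = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.classes.update(value.split())


def html_part_of_mhtml(raw_bytes):
    """Return the first text/html part of an MHTML (MIME) document as a string."""
    message = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in message.walk():
        if part.get_content_type() == "text/html":
            # get_content() undoes the transfer encoding (e.g. quoted-printable).
            return part.get_content()
    return ""


def class_counts(raw_bytes):
    """Count class names used in the HTML part of one MHTML file."""
    collector = ClassCollector()
    collector.feed(html_part_of_mhtml(raw_bytes))
    collector.close()
    return collector.classes


def missing_classes(word_bytes, aspose_bytes):
    """Class names present in the Word export but absent from the other export."""
    return sorted(set(class_counts(word_bytes)) - set(class_counts(aspose_bytes)))


# Usage, assuming both files from the ZIP are in the working directory:
#   with open("EP4_1_2_Word.mht", "rb") as w, open("EP4_1_2.mht", "rb") as a:
#       print(missing_classes(w.read(), a.read()))
```

The result is only a list of class names, which is enough to show a vendor or a client which selectors the downstream processing would no longer match.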

Thanks,
Ross

@rritchey-1,

Thanks for your inquiry. Please ZIP and attach your input Word document here for testing. We will investigate the issue on our side and provide you more information.

Hi Tahir,

Here is the input Word document.

EP4_1_2 (2).zip (62.8 KB)

Thanks.
Ross

@rritchey-1,

Thanks for sharing the document. The output documents generated by Aspose.Words and MS Word look the same when they are opened in a browser.

Could you please share the details of the use case in which you need these classes/structure? We will then log the requested feature in our issue tracking system accordingly. Thanks for your cooperation.

Hi Tahir,

My client is processing these documents and doing some restyling. They built their tool to process the file as it comes out of Word using Save As. Unfortunately, we need to automate the creation of these MHTML files on a server, so I added this step to the other processes we already run, hoping that the output would work for them.

They need classes in the HTML in order to target elements properly during processing. They are less concerned with how the output looks visually than with how the structure is actually built in the code. The only critical piece that they have noted to me is the classes.

If I can’t get classes out of Aspose.Words, I may have to go in a different direction and set up a process that uses the Microsoft Interop services to do a Save As on a local machine, but I would obviously prefer to avoid this if possible.

Thanks,
Ross

@rritchey-1,

Thanks for sharing the details. We have logged this feature request as WORDSNET-15670 in our issue tracking system. Our product team will look into the possibility of implementing this feature. Once we have any information about it, we will update you via this forum thread.

We apologize for the inconvenience.

Hi Tahir,

Thanks for the update. My client is looking for a faster and more certain resolution. They already have the appropriate licensing to pay for features to be added, as we have gone down this path before.

Can you estimate the costs and timeline to add this functionality into the .NET Aspose.Words libraries? They will want the classes that are added to match what MS Word generates.

Thanks,
Ross

@rritchey-1,

Thanks for your inquiry. We regret to inform you that this feature has been postponed. Replicating MS Word’s HTML export is a complex task and is not going to be implemented in the near future. We will inform you via this forum thread as soon as there are any further developments.

We apologize for the inconvenience.

Hi Tahir,

Is this true even if we pay to have it added now?

Thanks,
Ross

@rritchey-1,

Thanks for your inquiry. We are in communication with our product team about your query. We will inform you via this forum thread once there is any update on this.

Hi Tahir,

Any updates on how much it would cost to pay to have this functionality built in now?

Thanks,
Ross

@rritchey-1,

Thanks for your inquiry. We are still in communication with our product team about this feature. Once we receive a response from them, we will be able to share the cost and timeline of this feature.

Thanks for your patience.

@rritchey-1,

Thanks for your patience. Normally, when an issue is reported by a customer, it is added to the pool of current issues being worked on by our developers and is analysed in a timely manner. However, due to the nature of some bugs and the number of features we are working on, this doesn’t always mean we can fix every bug shortly after it is reported.

We do understand your situation; however, I am afraid there is no ETA available for your issue at the moment. This is a complex issue, and the MS Word HTML format is not documented. The implementation of this feature (WORDSNET-15670) has been postponed.

We are very sorry for the inconvenience.