Generating html from a word document

Hello,
I’m calling Save() on the document class in order to generate some html. I notice, however, that I don’t seem to be getting any representation of hidden fields. In particular, I have some {TA … } fields that I don’t see in the html. Is that what you expect? Does the Save method ignore these hidden fields when creating the html?
Thanks in advance,
Randall

Hi
Thanks for your request. Please attach your sample document. Also see the following document, which specifies which MS Word document features are supported when exporting to HTML and PDF:
https://docs.aspose.com/words/net/supported-document-formats/
Best regards.

Hello,
Thank you for all of your help. Attached is a document that contains several TA tags. From your spreadsheet, it looks like the save-to-HTML function does not support TA tags. I have a new concern, however: the HTML generated from that document does not look correct. See the example, attached as an HTML file to this post as well, to get an idea of where the HTML goes bad. If you stretch the HTML into a non-word-wrapped file, my logging tells me that it is breaking at column (position) 15385.
The problem, starting at position 15385, is that the third tag contains improperly nested tags. The document still displays in a browser, but crashes when treated as XML. I’m manipulating the HTML produced by Aspose in C# and thus need properly nested tags.
Can this problem be corrected?
Thanks for any insight…
-Randall

I see that I cannot attach two files to a post, so attached to this reply is the Word document that produces the HTML error explained in my previous post.

Hi
Thanks for the additional information. Please reattach your Word document.
Best regards.

See the attached word document.

Hi
Thanks for the additional information. I have tried to convert your file to HTML, and I think the HTML looks fine. Please tell me how I can reproduce your problem, and provide some code that will allow me to do this.
Best regards.

Hello,
The code to reproduce this issue simply calls the SaveAsHTML(memoryStream) method on the Document object.
I look at the HTML in string form. Here is the code snippet that reads the HTML into a string:

doc.SaveAsHTML(memoryStream);
memoryStream.Flush();
memoryStream.Seek(0, SeekOrigin.Begin);
StreamReader sr = new StreamReader(memoryStream);
string data = sr.ReadToEnd();

You can see the HTML that is not valid XHTML in the following snippet:

POINT I PLAINTIFFS MISAPPLY THE TEST FOR DEFAMATION

CASES

10

Any help is appreciated. Attached you’ll find the document that I’m using.

Hi
Thanks for the additional information. I tried to convert your document to XHTML, and everything seems to work fine on my side. I used the following code.

Document doc = new Document(@"295_99953_AppTech\in.doc");
MemoryStream memoryStream = new MemoryStream();
// save HTML as XHTML
doc.SaveOptions.HtmlExportXhtmlTransitional = true;
doc.Save(memoryStream, SaveFormat.Html);
memoryStream.Flush();
memoryStream.Seek(0, SeekOrigin.Begin);
StreamReader sr = new StreamReader(memoryStream);
string data = sr.ReadToEnd();
// read xml
XmlDocument xml = new XmlDocument();
xml.LoadXml(data);

Also, I used the latest version of Aspose.Words (4.4.1.0).
I hope this helps.
Best regards.
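
For anyone who needs to pin down where exported markup stops being well-formed, as in the "position 15385" report earlier in this thread, the following is a minimal sketch that uses only standard .NET XML classes. The class and method names (XhtmlCheck, FindFirstXmlError) are hypothetical and not part of Aspose.Words. It assumes the HTML has already been read into a string, as in the snippets above.

```csharp
using System;
using System.Xml;

// Hypothetical helper, not part of Aspose.Words.
public static class XhtmlCheck
{
    // Returns null when the markup parses as XML; otherwise returns a
    // message giving the line and position of the first parse error.
    public static string FindFirstXmlError(string markup)
    {
        try
        {
            XmlDocument xml = new XmlDocument();
            // Assumption: we do not want the parser to fetch an external DTD
            // referenced by a DOCTYPE, so no resolver is supplied.
            xml.XmlResolver = null;
            xml.LoadXml(markup);
            return null;
        }
        catch (XmlException ex)
        {
            // XmlException carries the location of the first error.
            return string.Format("line {0}, position {1}: {2}",
                ex.LineNumber, ex.LinePosition, ex.Message);
        }
    }
}
```

Calling XhtmlCheck.FindFirstXmlError(data) on the string produced above should return null when the export is well-formed. With no resolver set, named entities that are only defined in the XHTML DTD (for example &nbsp;) may also be reported as errors, so treat the result as a starting point for inspection rather than a full XHTML validation.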