Convert Word to HTML using classic ASP

I am sure many people have asked this question before, but I couldn’t find the answer I want…

We are using classic ASP and trying to do the following:

  1. open saved Word doc

  2. convert the Word doc to HTML and save the HTML file to one of our directories

  3. open the HTML file, read it, and extract its contents as a string in ASP

I know how to do step 1 in ASP:

Dim helper
Set helper = CreateObject("Aspose.Words.ComHelper")
Dim doc
Set doc = helper.Open(mydir & Filename & "." & fileExtension)

But I am not sure how to do 2 and 3.

In particular, I need to save the HTML as a file, not as a stream.

If there is sample code already posted which does this, please let me know.

Thank you for your help!!

I also tried this:

doc.Save_4 mydir & Filename & ".html", 3

This does save a file with the .html extension, but the content is XML.

If anyone can help me out, I will be really appreciative.

doc.Save_4 mydir & Filename & ".html", 4

https://reference.aspose.com/words/net/aspose.words/saveformat
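For steps 2 and 3 together, here is a minimal sketch rather than a definitive implementation. It reuses the `helper.Open` and `doc.Save_4 ..., 4` calls already shown in this thread and reads the saved file back with the standard `ADODB.Stream` object. The values assigned to `mydir`, `Filename`, and `fileExtension` are placeholders for illustration, and the `utf-8` charset is an assumption about how the HTML file is encoded; adjust it if your output uses a different encoding.

```vbscript
' Placeholder values for illustration only; use your own path and file name.
Dim mydir, Filename, fileExtension
mydir = "C:\docs\"
Filename = "sample"
fileExtension = "doc"

' Step 1: open the saved Word document (as in the original post).
Dim helper, doc
Set helper = CreateObject("Aspose.Words.ComHelper")
Set doc = helper.Open(mydir & Filename & "." & fileExtension)

' Step 2: save it to disk as HTML (format value 4, as in the reply above).
Dim htmlPath
htmlPath = mydir & Filename & ".html"
doc.Save_4 htmlPath, 4

' Step 3: read the saved file back into a string.
' Assumption: the HTML file is UTF-8 encoded.
Const adTypeText = 2
Dim stream, htmlText
Set stream = CreateObject("ADODB.Stream")
stream.Type = adTypeText
stream.Charset = "utf-8"
stream.Open
stream.LoadFromFile htmlPath
htmlText = stream.ReadText   ' with no argument, reads the whole stream
stream.Close
Set stream = Nothing
```

After this runs, `htmlText` holds the full HTML markup as a string. `ADODB.Stream` is used here instead of `Scripting.FileSystemObject` because it lets you name the charset explicitly.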

Thank you so much for the help.

If I may, I’d like to ask another question.

When I converted a doc to HTML, the bullets were converted to numbers (and not consecutive ones: it goes 1. 1. 3…).

Is that because I am using the evaluation version?

That should not occur. Bullets should remain bullets after conversion. Please attach the sample document. I will then try to reproduce this problem and include it in our fix plan.

Here you go.

And here is the converted html saved in txt format.

Thanks for providing the files. I have reproduced the problem. It seems to be caused by the fact that the lists in question are actually numbered lists that use a bullet as the number style. Aspose.Words apparently gets confused by this when converting them to HTML. I have logged this problem as issue #3325. We will try to fix it in the next release, and I will post a notification here as soon as it is done. Meanwhile, you can try to correct the lists by making them bulleted instead of numbered. That will cure the problem.

Best regards,

Ok, thanks.

I thought the bullets I used were standard ones, since I created the sample doc from one of the resume templates in MS Word 2000.

For testing purposes, I created lists with a bunch of bullet styles (regular circle, square, arrow, check mark …), and all of them were converted to a single circle bullet. Is that how Aspose.Words works?

Yes, Aspose.Words behaves a little differently from MS Word in list conversion. MS Word tries to preserve the appearance of the document and therefore converts lists to paragraphs. We try to preserve the essence of the formatting, so lists in the document are converted to HTML lists. But HTML lists support only a limited set of bullet/number formats:

DISC | CIRCLE | SQUARE | 1 | A | a | I | i

All other bullet types are downgraded to one of these.

Hope that explains it,

Do you have any updates on issue #3325?

I just uploaded the latest version of Aspose.Words (4.4.2.0), but the problem still exists.

Thank you for your reply.

Hi

Unfortunately, this issue is still open. We will notify you as soon as it is fixed. Thanks for your patience.

Best regards.

The issues you have found earlier (filed as 3325) have been fixed in this update.

