Free Support Forum - aspose.com

Saving HTML page to word - then loading into ASPOSE

I am saving an HTML page as a Word document:



string strCSSText;
string strPath = Server.MapPath(@"/Styles/Agenda.css");
strPath = Request.PhysicalApplicationPath + "Styles/Minutes.css";
System.IO.StreamReader objReader = new System.IO.StreamReader(strPath);

// Read the stylesheet and strip line breaks and tabs so it can be inlined.
strCSSText = objReader.ReadToEnd();
strCSSText = strCSSText.Replace("\n","");
strCSSText = strCSSText.Replace("\r","");
strCSSText = strCSSText.Replace("\t","");
strCSSText = "<style type=\"text/css\">" + strCSSText + "</style>";

objReader.Close();

Response.Clear();
Response.AddHeader("content-disposition", "attachment;filename=Minutes.doc");
Response.Charset = "";
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.ContentType = "application/vnd.word";



string strMinutes = "";

// Remove the external stylesheet links, since the CSS is inlined above.
strMinutes = edtMinutes.Xhtml.Replace("<link media='screen' rel='stylesheet' href='Styles/Minutes.css' type='text/css' media'all' />", "");
strMinutes = strMinutes.Replace("<link media='screen' href=\"Styles/Minutes.css\" type=\"text/css\" rel=\"stylesheet\" media?all?=\"\" />", "");



int intPackageID = Convert.ToInt32(ViewState["PackageID"]);
objPM = new PackageManager(intPackageID);
string strMinutesCover = objPM.GetMinutesCover();

Response.Write("" + strCSSText + strMinutesCover + "" + strMinutes + "");
Response.End();



-----



I can load this document into Word, edit it, and do whatever I like.

Now when I load this document into Aspose it gives me:



--------------------------

File Name: Minutes1.doc Size: 20389 bytes Type: Microsoft Word Document File Uploaded: December 6, 2005 9:49:40 AM
-------


Is there a way around this?

I wrapped the part I thought was failing in a try/catch, and here are the results:

Here is where the error happens:



try
{
    docWord = new Aspose.Word.Document(this.File.FullName);
    docWord.Save(strSaveFileName, Aspose.Word.SaveFormat.FormatHtml);
    docWord = null;
}
catch(Exception ex)
{
    // Execution ends up here; ex.Message holds the error quoted below.
}



It fails on the line docWord = new Aspose.Word.Document(this.File.FullName);

Error message: "Input string was not in a correct format."



I do a check before this to make sure the file actually exists, which it does.


After further investigation:

Documents produced by the code above are saved in Word's Web Page format
(HTML document), even though they carry a .doc extension. It seems Aspose
doesn't like this format. Is there a different way of loading this type of
document?
For the time being we are telling our users to re-save (Save As) the
documents as actual Word documents, which works perfectly. But that's
a short-term fix that we would like to avoid at all costs.
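
One way to tell the two kinds of file apart before handing them to Aspose is to look at the first bytes rather than the extension. The sketch below is only an illustration and not part of the code above: `WordFileSniffer` and its method names are made up for this example. It relies on the fact that binary .doc files are OLE2 compound files and start with a fixed 8-byte signature, whereas the HTML written by Response.Write starts with plain text.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of Aspose.Word or of the original page code.
public static class WordFileSniffer
{
    // Signature that opens every OLE2 compound file, the container format
    // used by binary .doc files (and also by .xls, .ppt and others).
    private static readonly byte[] OleSignature =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // True when the buffer starts with the OLE2 signature.
    public static bool IsOleCompoundFile(byte[] header)
    {
        if (header == null || header.Length < OleSignature.Length) return false;
        for (int i = 0; i < OleSignature.Length; i++)
        {
            if (header[i] != OleSignature[i]) return false;
        }
        return true;
    }

    // Reads the first 8 bytes of the file and checks them.
    public static bool IsOleCompoundFile(string path)
    {
        byte[] header = new byte[OleSignature.Length];
        using (FileStream stream = File.OpenRead(path))
        {
            int total = 0;
            while (total < header.Length)
            {
                int read = stream.Read(header, total, header.Length - total);
                if (read == 0) return false; // file is shorter than the signature
                total += read;
            }
        }
        return IsOleCompoundFile(header);
    }
}
```

A file named Minutes1.doc for which this returns false is most likely HTML (or some other non-binary format) with a .doc extension, so the upload page could warn the user at that point instead of failing later. The check only identifies the container, not that the content is a valid Word document.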



We don't like telling users "oh, to do this, you need to do
that". It should just work. Is there another way of importing
this document into Aspose? We will not use any API calls to the Word
object model, as that is very inefficient and can crash at any time, seeing
as Word is as stable as a house built on chopsticks.








Please attach the document in question so that we can investigate the case. Don't worry about privacy: the forums are configured so that attachments are visible only to Aspose team members.

To put it simply:

At the moment Aspose.Word can only reliably load HTML that was created by Aspose.Word or simpler HTML. Aspose.Word cannot really load HTML produced by MS Word because it produces pretty complex HTML.

In your case it probably throws because some measurement is specified in percent, in inches, or in some other unit. Aspose.Word supports only measurements specified in pixels in HTML.
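
If the units really are the cause, one possible workaround is to rewrite absolute CSS lengths as pixels before the HTML is written out. The following is a minimal sketch under that assumption, not a confirmed fix: `CssLengthNormalizer` is a hypothetical helper, the conversion uses the CSS convention of 96 pixels per inch, and whether the result then loads in Aspose.Word has not been verified.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Word.
public static class CssLengthNormalizer
{
    // A number followed by an absolute CSS unit, e.g. "1.5in", "12pt", "-2cm".
    // The lookbehind keeps the match from starting inside an identifier or number.
    private static readonly Regex AbsoluteLength = new Regex(
        @"(?<![\w.#-])(?<value>-?\d*\.?\d+)(?<unit>in|pt|pc|cm|mm)\b",
        RegexOptions.IgnoreCase);

    // Rewrites in/pt/pc/cm/mm lengths as whole pixels (96 px per inch).
    public static string ToPixels(string css)
    {
        return AbsoluteLength.Replace(css, m =>
        {
            double value = double.Parse(m.Groups["value"].Value, CultureInfo.InvariantCulture);
            double px;
            switch (m.Groups["unit"].Value.ToLowerInvariant())
            {
                case "in": px = value * 96.0; break;
                case "pt": px = value * 96.0 / 72.0; break;
                case "pc": px = value * 16.0; break;
                case "cm": px = value * 96.0 / 2.54; break;
                default:   px = value * 96.0 / 25.4; break; // mm
            }
            return Math.Round(px).ToString(CultureInfo.InvariantCulture) + "px";
        });
    }
}
```

It could be applied to strCSSText before the style block is assembled, for example strCSSText = CssLengthNormalizer.ToPixels(strCSSText);. Percentages are left untouched because they have no fixed pixel equivalent, so any rule that uses them would still have to be changed by hand.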

We will gradually improve HTML import to allow full HTML and CSS import. There is no hard estimate on this, it will evolve over the following months.

Thanks for the replies.



For now we can survive with the current solution of telling customers to save it as a Word document and not a web page.



I’ll be waiting for your next version though :smiley: