Free Support Forum - aspose.com

Html import using Aspose.Words document object

Hi,

I am using Aspose.Words document object to convert HTML to Word.

Dim doc As Document = New Document(stream, MyDir)

When I run the above code, I'm getting the following error,

'System.FormatException: Input string was not in a correct format.'

Note: I tried the same with the trial version of Aspose.Words, and it worked fine without any error.

Regards,

Ganeshpandi.M

IAEA, Vienna, Austria.



Hi

Thanks for your inquiry. Could you please attach your input file for testing? I will investigate this problem and provide you with more information.

Best regards,

Hi,

Please find attached the HTML file and a Word document that contains the error details.

Regards,
Ganeshpandi.M

Hi

Thanks for your inquiry. I can't reproduce this problem using the latest version of Aspose.Words. Please try upgrading to the latest version:
http://www.aspose.com/Community/Files/51/aspose.words/category1188.aspx

Best regards.

Thanks for the response.

It's working with the latest version.

Regards,

Ganeshpandi.M

Hi,

Please find the attachment for .html file and .doc file.

Dim stream As Stream = File.OpenRead(sHTMLfileName)
Dim doc As Document = New Document(stream, MyDir)
stream.Close()
doc.Save(MyDir & "testOut.doc")

I am trying to generate a Word document from the summaryTest.html file. In the output document, testOut.doc, the table is not displayed properly.

Could you please check the same and let me know as soon as possible? All other things are working fine.

Hi

Thank you for your request. I managed to reproduce this problem and created issue # 4298 in our defect database. Please expect a reply before the next hotfix (within 2-3 weeks). We might fix it by then or provide more information.

Best regards.

Hi,

Thanks for the response.

Could you please suggest a workaround for the time being? Or could you tell me what the problem with the table is, so that I can try to find a workaround myself?

It's very urgent.

Regards,
Ganeshpandi.M

Hi

I figured out that the problem is caused by the paddings and border colors. If you remove these from your HTML, the conversion will work fine.

PADDING-RIGHT: 5.4pt; PADDING-LEFT: 5.4pt; PADDING-BOTTOM: 0cm; PADDING-TOP: 0cm; BORDER-RIGHT: #e6e6e6;

You can remove these styles programmatically using the following code.

using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using Aspose.Words;

//Read the HTML
string inputHtml = File.ReadAllText(@"462_109214_John_C_Stewart\in.html");

//Create a regex to find paddings, e.g. "PADDING-LEFT: 5.4pt;"
Regex regexPadding = new Regex("PADDING.*?;");
MatchCollection matchesPadding = regexPadding.Matches(inputHtml);
foreach (Match match in matchesPadding)
{
    //Remove the padding
    inputHtml = inputHtml.Replace(match.Value, "");
}

//Create a regex to find border colors, e.g. "BORDER-TOP: #ffffff;"
Regex regexBorder = new Regex("BORDER(-TOP|-BOTTOM|-RIGHT|-LEFT): #.*?;");
MatchCollection matchesBorder = regexBorder.Matches(inputHtml);
foreach (Match match in matchesBorder)
{
    //Remove the border color
    inputHtml = inputHtml.Replace(match.Value, "");
}

//Create a memory stream from the cleaned HTML
MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(inputHtml));

//Create the document
Document doc = new Document(stream);

//Save the document
doc.Save("out.doc");
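
If your HTML uses lower-case style names, or has no trailing semicolon on the last declaration, the patterns above may miss some declarations. As a rough sketch only, the same cleanup can be written with Regex.Replace and case-insensitive patterns. The class name HtmlStyleCleaner and the method Strip are hypothetical names chosen for this example and are not part of Aspose.Words. The patterns assume inline styles like the ones quoted above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of Aspose.Words.
public static class HtmlStyleCleaner
{
    // Matches declarations such as "PADDING-LEFT: 5.4pt;" in any letter case.
    // The \b keeps attributes like cellpadding="0" untouched.
    private static readonly Regex Padding = new Regex(
        @"\bPADDING(-TOP|-BOTTOM|-RIGHT|-LEFT)?\s*:[^;""']*;?\s*",
        RegexOptions.IgnoreCase);

    // Matches border declarations whose value starts with a color,
    // such as "BORDER-RIGHT: #e6e6e6;".
    private static readonly Regex BorderColor = new Regex(
        @"\bBORDER(-TOP|-BOTTOM|-RIGHT|-LEFT)\s*:\s*#[^;""']*;?\s*",
        RegexOptions.IgnoreCase);

    // Returns the HTML with padding and border color declarations removed.
    public static string Strip(string html)
    {
        string withoutPadding = Padding.Replace(html, "");
        return BorderColor.Replace(withoutPadding, "");
    }
}
```

You could then call HtmlStyleCleaner.Strip(inputHtml) in place of the two loops and load the result through the MemoryStream as before.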

I hope that this will help you.

Best regards.

Thank you very much for your valuable time and help.

It's working.

Thanks with Regards,

Ganeshpandi.M

Hi,

I have a table in the HTML. When I export it to a Word document, the borders of some cells are not displayed.

Please find attached the HTML file and the Word document that was produced.

Note: It's very urgent. Please let me know, how to solve this problem.

Regards,

Ganeshpandi.M

Hi

Thanks for your request. I managed to reproduce the problem. I have created the issue # 4325 in our defect database. Please expect a reply before the next hotfix (within 2-3 weeks).

As a workaround, you can try using the following code.

//Open the input HTML
Document doc = new Document("test.html");

//Get the collection of cells
NodeCollection cellsCollection = doc.GetChildNodes(NodeType.Cell, true);

//Loop through the collection
foreach (Cell cell in cellsCollection)
{
    if (cell.CellFormat.VerticalMerge == CellMerge.Previous)
    {
        //Set the borders
        cell.CellFormat.Borders[BorderType.Right].LineStyle = LineStyle.Single;
        cell.CellFormat.Borders[BorderType.Bottom].LineStyle = LineStyle.Single;
    }
}

//Save the output
doc.Save("out.doc");

I hope that this will help you.

Best regards.

Thanks.

It's working.

Regards,

Ganeshpandi.M

The issues you have found earlier (filed as 4325) have been fixed in this update.



The issues you have found earlier (filed as WORDSNET-1514) have been fixed in this .NET update and in this Java update.

