Free Support Forum - aspose.com

Word Styles and HTML

Hi

I first create the text as HTML and then save it to a Word document. That works fine, but I need to tag the HTML so that Word can apply different styles to different parts of the text, which would simplify formatting the Word document. Is there a way to tag the text in the HTML so that Word treats it as having different styles? At the moment all of the text ends up with the 'Normal' style.

Hi

Thanks for your inquiry. Yes, you can achieve this. For example, see the following HTML:

<html>
<head>
    <style type="text/css">
        .myStyle { font-size:16pt; font-weight:bold; font-style:italic }
    </style>
</head>
<body>
    <div>
        <h1><span>This is heading 1 style</span></h1>
        <h2><span>This is heading 2 style</span></h2>
        <h3><span>This is heading 3 style</span></h3>
        <p><span>This is normal style</span></p>
        <p class="myStyle"><span>This is my custom style</span></p>
    </div>
</body>
</html>

Hope this helps.

Best regards.

OK, so when saved to Word, the <p class="myStyle"> tag will be converted to a Word style named 'myStyle', and e.g. <font style="..."> will not? What about a <div style="...">?

I think my problem, then, is that the following code removes all formatting when I only want to remove the hyperlinks (I got this code from someone at Aspose some time ago). How do I remove just the hyperlink formatting? I do not want the hyperlinks to be included in the Word document.

Public Function removeHyperlinkFormatting(ByRef wordDoc As Aspose.Words.Document) As Boolean
    Try
        If Not wordDoc Is Nothing Then
            Dim fieldStarts As NodeCollection = wordDoc.GetChildNodes(NodeType.FieldStart, True)
            Dim nodesForRemoval As ArrayList = New ArrayList
            Dim fieldStart As Fields.FieldStart

            For Each fieldStart In fieldStarts
                If fieldStart.FieldType = Fields.FieldType.FieldHyperlink Then
                    Dim node As Node = fieldStart

                    While node.NodeType <> NodeType.FieldSeparator
                        nodesForRemoval.Add(node)
                        node = node.NextSibling
                    End While
                    nodesForRemoval.Add(node)

                    While node.NodeType <> NodeType.FieldEnd
                        If node.NodeType = NodeType.Run Then
                            CType(node, Run).Font.ClearFormatting()
                        End If
                        node = node.NextSibling
                    End While
                    nodesForRemoval.Add(node)
                End If
            Next

            For Each node As Node In nodesForRemoval
                node.Remove()
            Next
        End If

        Return True
    Catch ex As Exception
        QARoutinesLog.doLog(ex.ToString, TYPEERRORWORD)
        Return False
    End Try
End Function

End Class

Hi

Thanks for your request. The same approach can be applied to spans if you would like to use character styles.

<html>
<head>
    <style type="text/css">
        .myStyle { font-size:16pt; font-weight:bold; font-style:italic }
        .myCharStyle { font-size:10pt; font-weight:bold; }
    </style>
</head>
<body>
    <div>
        <h1><span>This is heading 1 style</span></h1>
        <h2><span>This is heading 2 style</span></h2>
        <h3><span>This is heading 3 style</span></h3>
        <p>
            <span>This is normal style </span>
            <span class="myCharStyle">This is my character style</span>
        </p>
        <p class="myStyle"><span>This is my custom style</span></p>
    </div>
</body>
</html>
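If you want to check which styles the text actually ends up with after import, a rough sketch like the one below may help. It loads a shortened version of the HTML above and prints the style name of each paragraph and run. The class name StyleInspector and the inline HTML string are made up for illustration, and the exact style names you see may depend on your Aspose.Words version, so treat this as a diagnostic aid rather than a statement of what the importer does.

```csharp
using System;
using System.IO;
using System.Text;
using Aspose.Words;

public static class StyleInspector
{
    public static void Main()
    {
        // Hypothetical input: a shortened version of the HTML shown above.
        string html =
            "<html><head><style type=\"text/css\">" +
            ".myStyle { font-size:16pt; font-weight:bold; font-style:italic }" +
            ".myCharStyle { font-size:10pt; font-weight:bold; }" +
            "</style></head><body>" +
            "<h1>This is heading 1 style</h1>" +
            "<p>This is normal style <span class=\"myCharStyle\">This is my character style</span></p>" +
            "<p class=\"myStyle\">This is my custom style</p>" +
            "</body></html>";

        using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(html)))
        {
            // Load the HTML the same way as when creating a document from a stream.
            Document doc = new Document(stream);

            // Print the style assigned to every paragraph and to each run directly inside it.
            foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
            {
                Console.WriteLine("Paragraph style: " + para.ParagraphFormat.StyleName);
                foreach (Run run in para.Runs)
                {
                    Console.WriteLine("  Run style: " + run.Font.StyleName + " | " + run.Text);
                }
            }
        }
    }
}
```

If every paragraph still reports 'Normal', the class attributes are probably not reaching the importer in the form shown above.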

Regarding hyperlinks: if you need the hyperlinks to appear as plain text, you can simply remove the hyperlink tags from your HTML before loading it. For example, you can use a regular expression to do this. Please see the following code:

using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using Aspose.Words;

// Read HTML string.
string html = File.ReadAllText(@"Test001\in.html");

// Replace each hyperlink in the HTML string with its inner text.
// The lazy (.*?) stops at the first closing </a>, so each link is handled separately.
Regex regex = new Regex(@"<a\b[^>]*>(.*?)</a>", RegexOptions.Singleline | RegexOptions.IgnoreCase);
html = regex.Replace(html, "$1");

// Get HTML bytes and create stream.
byte[] htmlBytes = Encoding.UTF8.GetBytes(html);
MemoryStream htmlStream = new MemoryStream(htmlBytes);

// Create document from stream.
Document doc = new Document(htmlStream);

// Save output document.
doc.Save(@"Test001\out.doc");
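To try the hyperlink-stripping step on its own, without Aspose.Words, here is a small sketch. It assumes the intended pattern is <a\b[^>]*>(.*?)</a>, which matches an opening anchor tag, lazily captures its content, and stops at the first closing tag. The names HyperlinkStripper and StripHyperlinks are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyperlinkStripper
{
    // Matches an opening <a ...> tag, lazily captures its content, and stops at the first </a>.
    // \b keeps the pattern from matching other tags that start with "a", such as <abbr>.
    private static readonly Regex AnchorRegex = new Regex(
        @"<a\b[^>]*>(.*?)</a>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns the HTML with every hyperlink replaced by its inner content.
    public static string StripHyperlinks(string html)
    {
        return AnchorRegex.Replace(html, "$1");
    }
}

public static class Program
{
    public static void Main()
    {
        string html =
            "<p class=\"myStyle\">See <a href=\"http://example.com/a\">first</a> and " +
            "<A HREF=\"http://example.com/b\"><b>second</b></A>.</p>";

        Console.WriteLine(HyperlinkStripper.StripHyperlinks(html));
        // Prints: <p class="myStyle">See first and <b>second</b>.</p>
    }
}
```

Note that the class attribute and the nested <b> tag are left untouched, so only the link itself is dropped. A regular expression is a pragmatic choice for simple, well-formed markup; it does not handle anchors nested inside other anchors.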

Hope this helps.

Best regards.

Got the regex stuff working, no doubt I’ll get the styles working as well. The key seems to be the tag. Thanks once again, your support is excellent!