Hi
I create text as HTML first and then save it to a Word document. Everything works, but I need to tag the HTML so that Word applies different styles to different parts of the text. This would simplify formatting the Word document. Is there a way to tag the text in the HTML so that Word recognizes it as having different styles? At the moment all text ends up with the ‘Normal’ style.
Hi
Thanks for your inquiry. Yes, of course you can achieve this. For example, see the following HTML:
<html>
<head>
<style type="text/css">
.myStyle {
font-size: 16pt;
font-weight: bold;
font-style: italic
}
</style>
</head>
<body>
<div>
<h1><span>This is heading 1 style</span></h1>
<h2><span>This is heading 2 style</span></h2>
<h3><span>This is heading 3 style</span></h3>
<p><span>This is normal style</span></p>
<p class="myStyle"><span>This is my custom style</span></p>
</div>
</body>
</html>
Hope this helps.
Best regards.
OK, so when saved to Word, the <p class="myStyle"> tag will be converted to a Word style named ‘myStyle’, and e.g. <font style="..." will not? What about a <div style=... ?
I think my problem is that the following code removes all formatting when I only want to remove the hyperlinks (I got this code from someone at Aspose some time ago). How do I remove just the hyperlink formatting? I do not want the hyperlinks to be included in the Word document.
Public Function removeHyperlinkFormatting(ByRef wordDoc As Aspose.Words.Document) As Boolean
    Try
        If Not wordDoc Is Nothing Then
            Dim fieldStarts As NodeCollection = wordDoc.GetChildNodes(NodeType.FieldStart, True)
            Dim nodesForRemoval As ArrayList = New ArrayList
            Dim fieldStart As Fields.FieldStart
            For Each fieldStart In fieldStarts
                If fieldStart.FieldType = Fields.FieldType.FieldHyperlink Then
                    Dim node As Node = fieldStart
                    While node.NodeType <> NodeType.FieldSeparator
                        nodesForRemoval.Add(node)
                        node = node.NextSibling
                    End While
                    nodesForRemoval.Add(node)
                    While node.NodeType <> NodeType.FieldEnd
                        If node.NodeType = NodeType.Run Then
                            CType(node, Run).Font.ClearFormatting()
                        End If
                        node = node.NextSibling
                    End While
                    nodesForRemoval.Add(node)
                End If
            Next
            For Each node As Node In nodesForRemoval
                node.Remove()
            Next
        End If
        Return True
    Catch ex As Exception
        QARoutinesLog.doLog(ex.ToString, TYPEERRORWORD)
        Return False
    End Try
End Function
Hi
Thanks for your request. The same approach can be applied to spans if you would like to use character styles.
<html>
<head>
<style type="text/css">
.myStyle {
font-size: 16pt;
font-weight: bold;
font-style: italic
}
.myCharStyle {
font-size: 10pt;
font-weight: bold;
}
</style>
</head>
<body>
<div>
<h1><span>This is heading 1 style</span></h1>
<h2><span>This is heading 2 style</span></h2>
<h3><span>This is heading 3 style</span></h3>
<p>
<span>This is normal style </span>
<span class="myCharStyle">This is my character style</span>
</p>
<p class="myStyle"><span>This is my custom style</span></p>
</div>
</body>
</html>
Regarding hyperlinks, if you need the hyperlinks to appear as plain text, you can simply remove them from your HTML before loading it. For example, you can use a regular expression to do this. Please see the following code:
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using Aspose.Words;

// Read HTML string.
string html = File.ReadAllText(@"Test001\in.html");
// Replace each <a ...>...</a> element in the HTML string with its inner text.
// The lazy group (.*?) keeps every link separate when there are several in the text.
Regex regex = new Regex(@"<a\b[^>]*>(.*?)</a>", RegexOptions.Singleline | RegexOptions.IgnoreCase);
html = regex.Replace(html, "$1");
// Get HTML bytes and create stream.
byte[] htmlBytes = Encoding.UTF8.GetBytes(html);
MemoryStream htmlStream = new MemoryStream(htmlBytes);
// Create document from stream.
Document doc = new Document(htmlStream);
// Save output document.
doc.Save(@"Test001\out.doc");
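If you want to check the replacement on its own before involving Aspose.Words, here is a minimal sketch that uses only the .NET regex classes. The class name HyperlinkStripper, the method StripAnchors, and the sample HTML are made up for illustration; the pattern assumes the links are ordinary, non-nested <a ...>...</a> elements.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyperlinkStripper
{
    // Hypothetical helper: replaces every <a ...>...</a> element with its inner content.
    // \b stops the pattern from matching other tags that start with "a", such as <abbr>.
    // (.*?) is lazy, so each link is matched separately instead of first <a> to last </a>.
    public static string StripAnchors(string html)
    {
        Regex regex = new Regex(@"<a\b[^>]*>(.*?)</a>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);
        return regex.Replace(html, "$1");
    }

    public static void Main()
    {
        // Sample input (assumed): two links, one with upper-case tags and nested markup.
        string html = "<p>See <a href=\"http://example.com/a\">first</a> and "
                    + "<A HREF=\"http://example.com/b\"><b>second</b></A>.</p>";

        // Prints: <p>See first and <b>second</b>.</p>
        Console.WriteLine(StripAnchors(html));
    }
}
```

Formatting inside the link, like the <b> element above, stays in the HTML, so only the hyperlink itself is dropped. A regular expression is not a full HTML parser, so unusual markup may need a different approach.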
Hope this helps.
Best regards.
Got the regex stuff working, no doubt I’ll get the styles working as well. The key seems to be the tag. Thanks once again, your support is excellent!