Free Support Forum - aspose.com

What's special about special characters

When I add paragraphs containing special characters (©, ®, etc.) in Word, I expect them to be stored in separate special-character tags, similar to how smart tags are stored in the XML (when we unzip the document). Instead, they are stored as text in the run node.

Why are they not stored as separate tags, as smart tags are? What are we missing here? It would be helpful if you could send a Word document containing special characters whose XML has special-character tag nodes in it.

Thanks.


Hi Praneeth,


Thanks for your inquiry. First of all, please note that Aspose.Words is quite different from Microsoft Word’s Object Model in that it represents the document as a tree of objects, more like an XML DOM tree. If you have worked with any XML DOM library, you will find Aspose.Words easy to understand and work with. When you load a Word document into Aspose.Words, it builds its DOM, and all document elements and formatting are loaded into memory. Please read the following articles for more information on the DOM:
http://www.aspose.com/docs/display/wordsjava/Object+Model+Overview
http://www.aspose.com/docs/display/wordsjava/Composition+Diagrams

Please note that all text of the document is stored in runs of text. Yes, the characters (©, ®, etc.) in a Word document are stored in Run nodes. This is the expected behaviour of Aspose.Words.

Moreover, if you insert these characters into a Word document using MS Word, they are added to document.xml as follows. Runs most commonly contain text elements <w:t>, which hold the actual literal text of a paragraph:

<w:p w:rsidP="00080A18" w:rsidRDefault="00080A18" w:rsidRPr="00080A18" w:rsidR="00080A18">
  <w:r>
    <w:t>€ £ ¥</w:t>
  </w:r>
  <w:bookmarkStart w:name="_GoBack" w:id="0"/>
</w:p>
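To check this against your own document.xml, here is a minimal sketch in plain Java. It uses only the JDK's XML parser, not Aspose.Words, and simply lists the text of every <w:t> element. The class name RunTextExtractor and the method extractRunTexts are hypothetical names chosen for illustration, and the sample XML is a trimmed-down version of the paragraph above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Aspose.Words.
public class RunTextExtractor {
    // WordprocessingML main namespace, bound to the "w" prefix in document.xml.
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the text of every w:t element, in document order. */
    static List<String> extractRunTexts(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for namespace-based lookup
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document dom = builder.parse(new InputSource(new StringReader(xml)));

        NodeList textNodes = dom.getElementsByTagNameNS(W_NS, "t");
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < textNodes.getLength(); i++) {
            texts.add(textNodes.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Euro, pound and yen signs stored as ordinary run text.
        String xml =
            "<w:p xmlns:w=\"" + W_NS + "\">"
          + "<w:r><w:t>\u20AC \u00A3 \u00A5</w:t></w:r>"
          + "</w:p>";
        System.out.println(extractRunTexts(xml));
    }
}
```

The symbols come back as ordinary characters inside the run text, which matches the fact that Word writes no dedicated element for them.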

Hi Tahir,
Thanks for your reply.

How can we collect all the special characters in the document? I thought a special character was a separate node, per the com/aspose/words/SpecialChar.html javadocs, and a child of a paragraph. But if it is not represented as a separate tag, how can we access the formatting associated with special characters?

Hi Praneeth,


Thanks for your inquiry.

A Microsoft Word document can include a number of special characters that represent fields, form fields, shapes, OLE objects, footnotes etc. For the list of special characters see ControlChar.
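As a rough illustration of what these control characters look like in text, here is a small plain-Java sketch that counts the field-related ones in a string. It assumes the character codes 19, 20 and 21 for field start, field separator and field end, as I recall them from the ControlChar documentation; please verify them against your Aspose.Words version. The class ControlCharScan and the method countFieldChars are hypothetical names, and the sample string is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Aspose.Words.
public class ControlCharScan {
    // Assumed values (see ControlChar in the API reference); verify before relying on them.
    static final char FIELD_START_CHAR = (char) 19;
    static final char FIELD_SEPARATOR_CHAR = (char) 20;
    static final char FIELD_END_CHAR = (char) 21;

    /** Counts field start, separator and end characters in the given text. */
    static Map<String, Integer> countFieldChars(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("fieldStart", 0);
        counts.put("fieldSeparator", 0);
        counts.put("fieldEnd", 0);
        for (char c : text.toCharArray()) {
            if (c == FIELD_START_CHAR) {
                counts.merge("fieldStart", 1, Integer::sum);
            } else if (c == FIELD_SEPARATOR_CHAR) {
                counts.merge("fieldSeparator", 1, Integer::sum);
            } else if (c == FIELD_END_CHAR) {
                counts.merge("fieldEnd", 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up paragraph text containing one PAGE field.
        String paragraphText = "Page " + FIELD_START_CHAR + " PAGE "
            + FIELD_SEPARATOR_CHAR + "1" + FIELD_END_CHAR;
        System.out.println(countFieldChars(paragraphText));
        // prints {fieldStart=1, fieldSeparator=1, fieldEnd=1}
    }
}
```

Symbols such as © and ® are not in this set; they are ordinary text characters.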

You can access the formatting of text, characters, and symbols (©, ®, etc.) using the Run.Font property.

Hi Tahir,
Thanks for the reply. We still do not understand in which case a special character would be a direct child of a paragraph, so that NodeType.SPECIAL_CHAR should be used. Could you please provide some more information about NodeType.SPECIAL_CHAR as a child of a Paragraph node? Thanks.

Hi Praneeth,


Thanks for your inquiry. The SpecialChar class is the base class for special characters in the document.

SpecialChar is used as a base class for more specific classes that represent special characters for which Aspose.Words provides programmatic access. The SpecialChar class is also used itself to represent special characters for which Aspose.Words does not provide detailed programmatic access.

One example is fields in a Word document: FieldStart derives from SpecialChar and can only be a child of a Paragraph. Please check the following class hierarchy and the attached DOM image for details. Hope this answers your query. Please let us know if you have any more queries.

System.Object
  Aspose.Words.Node
    Aspose.Words.Inline
      Aspose.Words.SpecialChar
        Aspose.Words.Fields.FieldChar
          Aspose.Words.Fields.FieldStart

Thanks Tahir, could you please send us the Word document for which you attached the DOM image? I’m curious about the paragraph in the footnote that has both SpecialChar and Run nodes. Thanks.

Hi Praneeth,


Thanks for your inquiry. Please check the attached document.