Stripping HTML attributes while writing content to a Word document

Hello,


I’m using the insertHtml method of the DocumentBuilder class to insert HTML into the document.

When this document is saved as HTML using document.save(), span attributes such as id, name, timestamp and title are no longer present when I read the same document back.

I have gone through one of the Aspose forum threads on this; the link is below.


For example:

While writing text to the document, the span tag looks like this:

<span style="background:#C2D69B" class="editorguidelines" timestamp="1453892184131" title="Inserted by Admin CCDS on 1/27/2016, 4:26:24 PM"> a powerful way to help you prove your point. When you click Online Video, you can paste in the embed code for the video you want to add.</span>

While reading text from the document, the same content comes back without the span tag:

a powerful way to help you prove your point. When you click Online Video, you can paste in the embed code for the video you want to add.


The class, timestamp and title attributes of the span tag are missing. How can I retain them using Aspose.Words?

Any help on this would be appreciated.
Please do the needful ASAP.

Thanks
Hi Srini,

Thank you for your inquiry. Aspose.Words does not import the attributes of the span you shared into its DOM. When you load a Word or HTML document into Aspose.Words, it builds a DOM (Document Object Model) in memory, which lets you programmatically read, manipulate and modify the content and formatting of the document. You get detailed programmatic access to document elements and formatting through the classes of the Aspose.Words DOM.

Hi tahir,

Thank you for your reply. Can I write customized styles using classes in Aspose.Words?

I mean, in the post I shared there is a class "editorguidelines". Can I replace it with an Aspose.Words class and write the content to the Word document, so that when I read the same document I can get that class back programmatically?

If it is possible, could you please share the code?


Thanks
Hi Srini,

Thanks for your inquiry. Please note that formatting is applied on a few different levels. For example, let’s consider the formatting of simple text. Text in a document is represented by a Run element, and a Run can only be a child of a Paragraph. You can apply formatting

1) to Run nodes by using Character Styles, e.g. a Glyph Style,
2) to the parent of those Run nodes, i.e. a Paragraph node (possibly via Paragraph Styles),
3) directly to Run nodes by using Run attributes (Font). In this case the Run inherits formatting from the Paragraph Style first, then the Glyph Style, and finally the direct formatting.
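To make the order of the three levels concrete, here is a minimal sketch in plain Java. It is only a toy model: the class FormattingLayers and its resolve method are hypothetical names made up for illustration and are not part of the Aspose.Words API. It assumes each level can be treated as a map of property names to values, and that a later level overrides an earlier one for the same property.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: hypothetical class, NOT part of Aspose.Words.
class FormattingLayers {

    // Merge the three levels in order: paragraph style first,
    // then character (glyph) style, then direct run formatting.
    // A later level overrides an earlier one for the same property.
    static Map<String, String> resolve(Map<String, String> paragraphStyle,
                                       Map<String, String> characterStyle,
                                       Map<String, String> directFormatting) {
        Map<String, String> effective = new LinkedHashMap<>(paragraphStyle);
        effective.putAll(characterStyle);
        effective.putAll(directFormatting);
        return effective;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        Map<String, String> paragraphStyle = Map.of("font-size", "11pt", "color", "black");
        Map<String, String> characterStyle = Map.of("font-size", "40px");
        Map<String, String> directFormatting = Map.of("color", "red");

        Map<String, String> effective =
                resolve(paragraphStyle, characterStyle, directFormatting);

        System.out.println(effective.get("font-size")); // prints "40px"
        System.out.println(effective.get("color"));     // prints "red"
    }
}
```

In this model the font size comes from the character style and the color from the direct formatting, while anything set only on the paragraph style would pass through unchanged.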

Yes, the class 'editorguidelines' will be imported into the Aspose.Words DOM. In this case, you need to use the class as shown in the following HTML. Please let us know if you have any more queries.

<style>
.editorguidelines
{
font-size: 40px;
}
</style>
<span class="editorguidelines">Test</span>