I'm trying to generate a Word document from a set of HTML pages. The HTML is imported using DocumentBuilder's InsertHtml. Unfortunately, the HTML contains some random unwanted formatting, so I would like to do some basic reformatting of the text. For example, I would like to set the font or text color to a specific value for a section or paragraph of text.
I could extract the plain text and then reinsert it with the formatting applied, but then I would also lose all the structure the HTML had. So what is the best way to do this?
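To make the goal concrete, here is a minimal sketch of one possible workaround: cleaning the HTML before it is imported, so the structure (paragraphs, lists, links) survives and only the styling is dropped. It is written in Python with the standard library only and does not use the Word library at all. It assumes the unwanted formatting comes from inline `style` attributes, legacy presentational attributes, and `<font>` tags; the names `STYLE_ATTRS`, `DROP_TAGS`, `FormattingStripper`, and `strip_formatting` are made up for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these attributes and tags carry the unwanted formatting.
STYLE_ATTRS = {"style", "color", "face", "size", "bgcolor"}
DROP_TAGS = {"font"}  # the tag is dropped, its children are kept


class FormattingStripper(HTMLParser):
    """Re-emits HTML without styling attributes and <font> wrappers."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _emit_open(self, tag, attrs, closing):
        if tag in DROP_TAGS:
            return
        kept = "".join(
            f" {name}" if value is None
            else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name not in STYLE_ATTRS
        )
        self.parts.append(f"<{tag}{kept}{closing}>")

    def handle_starttag(self, tag, attrs):
        self._emit_open(tag, attrs, "")

    def handle_startendtag(self, tag, attrs):
        self._emit_open(tag, attrs, " /")

    def handle_endtag(self, tag):
        if tag not in DROP_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape them.
        self.parts.append(escape(data, quote=False))


def strip_formatting(html):
    """Return the HTML with inline styling removed and structure kept."""
    parser = FormattingStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = '<p style="color:red">Hi <font face="Arial"><b>there</b></font></p>'
    print(strip_formatting(sample))
```

The cleaned string would then be passed to InsertHtml in place of the original. This sketch drops comments and doctype declarations and does not handle `<style>` or `<script>` blocks, so it only fits HTML fragments where the formatting is inline. It also only removes formatting; setting a specific font or color for a section would still have to happen on the Word side.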