I recently found out how to read text from a document, but it comes out as plain, unformatted text.
Is it possible to read the text with its formatting as well? If so, how?
Not at the moment, but we are interested in your ideas about how you would see it working for you. They might help us define a new feature for the product.
We are thinking of building an API similar to the MS Word Automation Object Model, but that is technically very difficult because of the number of live collections in the model.
Well, I’d want to read the text with its formatting, so I’d imagine a property alongside Range.Text, such as Range.FormattedText or Range.RichText, that contains the text including formatting elements.
I understand this may become a lot (since there are so many properties), but I’d say start with a few simple, commonly used properties: font styles (bold, underline, italic), font name and font size. If I had that, it would already be very helpful.
What data type do you see for Range.FormattedText or Range.RichText? Do you want it as a string in RTF or HTML format, or do you need it all exposed as objects?
You realize the document consists of sections, paragraphs and text runs. Bold or other formatting can be applied to any portion of text, and so on. It will grow into a very complex object model, and I don’t yet see a way to build it in simple steps.
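To make the sections, paragraphs and text runs structure concrete, here is a minimal sketch in Python. Every class and field name in it (Run, Paragraph, Section, Document, and the formatting fields) is hypothetical and chosen only for illustration; it is not the product's API. It only shows how formatting can hang off runs while plain text is still recoverable.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Run:
    """A stretch of text that shares one set of character formatting (hypothetical)."""
    text: str
    bold: bool = False
    italic: bool = False
    underline: bool = False
    font: str = "Times New Roman"  # assumed default, for illustration only
    size: float = 12.0             # assumed default size in points


@dataclass
class Paragraph:
    runs: List[Run] = field(default_factory=list)

    @property
    def text(self) -> str:
        # Plain text is the run texts joined together, formatting dropped.
        return "".join(run.text for run in self.runs)


@dataclass
class Section:
    paragraphs: List[Paragraph] = field(default_factory=list)


@dataclass
class Document:
    sections: List[Section] = field(default_factory=list)

    @property
    def text(self) -> str:
        # One line per paragraph, across all sections.
        return "\n".join(
            paragraph.text
            for section in self.sections
            for paragraph in section.paragraphs
        )


doc = Document([
    Section([
        Paragraph([Run("Formatting can apply to "), Run("any portion", bold=True), Run(" of text.")]),
        Paragraph([Run("Second paragraph.", italic=True)]),
    ])
])
print(doc.text)
```

Even this reduced model has three nested collections, which hints at why a full model with styles, tables and live collections grows quickly.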
Maybe you can tell me what your task is and why you need to retrieve formatting properties, and I will be able to come up with an idea.
Also see the IDocumentWriter topic and let me know if it is relevant.
Yes, I’d think of Range.RichText as a String property. ‘Under the hood’ there must be some kind of object model or tree model that holds the structure of the document, including the formatting properties.
If you have such a structure and only keep formatting options like font styles and sizes, leaving out all other options, that would already be very welcome.
This structure I’m talking about could be the same as for a possible IDocumentWriter.
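A minimal sketch of that idea, assuming a hypothetical Run record and a hypothetical rich_text function (neither is part of the product): it walks a list of runs and emits an HTML string that keeps only bold, italic, underline, font name and font size, ignoring everything else.

```python
from dataclasses import dataclass
from html import escape
from typing import Iterable


@dataclass
class Run:
    """Hypothetical text run carrying only the few requested formatting properties."""
    text: str
    bold: bool = False
    italic: bool = False
    underline: bool = False
    font: str = ""     # empty string means "not specified"
    size: float = 0.0  # 0 means "not specified"; otherwise points


def run_to_html(run: Run) -> str:
    """Wrap the escaped run text in the tags that match its formatting."""
    html = escape(run.text)
    if run.underline:
        html = f"<u>{html}</u>"
    if run.italic:
        html = f"<i>{html}</i>"
    if run.bold:
        html = f"<b>{html}</b>"
    styles = []
    if run.font:
        styles.append(f"font-family: {run.font}")
    if run.size:
        styles.append(f"font-size: {run.size:g}pt")
    if styles:
        html = f'<span style="{escape("; ".join(styles))}">{html}</span>'
    return html


def rich_text(runs: Iterable[Run]) -> str:
    """Concatenate the HTML of all runs into one string."""
    return "".join(run_to_html(run) for run in runs)


print(rich_text([Run("Hello "), Run("world", bold=True)]))
```

A string like this could be placed in an HTML-capable edit box on a web page; going back from edited HTML to the tree is a separate problem that this sketch does not cover.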
My task is to read pieces of a Word document and put them in a memo box on a web page, where the user can edit all his data (preferably as formatted text) and finally save it in a database or back into a Word document.
We plan to support HTML and RTF import and export, so if you want it as a string, you will be able to obtain the document in HTML.
I also understand the need to allow access to the formatted document content, but I still need to think about what it should look like. So there is no estimated delivery date yet; maybe sometime in August.