Access and Modify each segment of text in a Word document

Hi,

I am testing Aspose.Words for .NET to process Word documents. I want to access each segment of text in a Word document, whether it is a single sentence, a paragraph, a heading, or the caption of an image or table. After accessing the text, I want to edit it (modify, add to, or replace it), put it back at the same location in the document, and save the updated document (or save it as a new document). I need sample code for this operation. The formatting, overall appearance, and structure of the document must remain the same, as I only need to edit the text.

Your prompt reply would be appreciated.

@uax99 If your goal is to replace some text in your document, you can easily achieve this using the Find/Replace functionality. Please see our documentation for more information:
https://docs.aspose.com/words/net/find-and-replace/

Content in MS Word documents is not represented as simple text; it is represented using nodes. Please see our documentation to learn more about the Aspose.Words Document Object Model:
https://docs.aspose.com/words/net/aspose-words-document-object-model/
The text pieces are represented by Run nodes.

No, I don’t want find and replace. I don’t know what text there is in the document. So I need to get the text, process and update it, and put it back at the same location in the document.

So how would I be able to access each node that contains a piece of text and then update that text?

Thanks

@uax99 You can get all Run nodes using the GetChildNodes method and then loop through the returned nodes:

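using Aspose.Words;

Document doc = new Document(@"C:\Temp\in.docx");
// Collect every Run node in the document, including those in nested nodes.
NodeCollection runs = doc.GetChildNodes(NodeType.Run, true);
foreach (Run r in runs)
{
    // Process Run nodes: read r.Text, then assign the updated text back to r.Text.
}
doc.Save(@"C:\Temp\out.docx");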

But in my opinion DocumentVisitor is a more convenient way to loop through the document's nodes.
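
A minimal sketch of the DocumentVisitor approach might look like the following. The class name RunTextEditor and the upper-casing edit are placeholders introduced only for illustration; the file paths are the same example paths used above. The idea is that overriding VisitRun gives access to each text piece, and assigning to Run.Text changes only the text while the run keeps its own formatting.

```csharp
using Aspose.Words;

// Hypothetical visitor (name is illustrative) that edits the text of every Run it visits.
class RunTextEditor : DocumentVisitor
{
    public override VisitorAction VisitRun(Run run)
    {
        // Placeholder edit: replace this with your own text processing.
        run.Text = run.Text.ToUpperInvariant();

        // Keep walking the rest of the document tree.
        return VisitorAction.Continue;
    }
}

class Program
{
    static void Main()
    {
        Document doc = new Document(@"C:\Temp\in.docx");

        // Accept walks the whole document and calls VisitRun for each Run node.
        doc.Accept(new RunTextEditor());

        doc.Save(@"C:\Temp\out.docx");
    }
}
```

One caveat to keep in mind: field codes are also stored in Run nodes, so a real implementation would probably need to skip those runs rather than edit them.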

In any case, before processing the document, I would suggest calling Document.JoinRunsWithSameFormatting to reduce the number of runs in the document and make the text pieces more convenient to process.
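
As a rough sketch of how that suggestion could fit into the loop, assuming the same example paths as above, the runs can be joined first and then edited in place. The ProcessText helper is hypothetical and stands in for whatever processing you need.

```csharp
using System;
using Aspose.Words;

class Program
{
    // Hypothetical helper: stands in for your own text processing.
    static string ProcessText(string text)
    {
        return text.Replace("colour", "color");
    }

    static void Main()
    {
        Document doc = new Document(@"C:\Temp\in.docx");

        // Merge adjacent runs that share identical formatting, so that
        // text is split into fewer pieces. Returns the number of joins performed.
        int joins = doc.JoinRunsWithSameFormatting();
        Console.WriteLine("Joins performed: " + joins);

        // Edit each remaining run in place; only the text is changed.
        foreach (Run r in doc.GetChildNodes(NodeType.Run, true))
        {
            r.Text = ProcessText(r.Text);
        }

        doc.Save(@"C:\Temp\out.docx");
    }
}
```

Even after joining, a single sentence can still span several runs if its formatting changes partway through (for example, one bold word), so processing per run does not always mean processing per sentence.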