Identifying Formatting in Words

Hi,

Is Aspose able to capture formatting in a Word document, e.g. bold, size, font, underline? I need to process a Word document and identify text that was originally bold as more important.

Thanks

Shu Yih

Hi,

Sure, it’s possible. Extracting various parts of the document, as well as their formatting, is described here:

https://docs.aspose.com/words/net/how-to-extract-selected-content-between-nodes-in-a-document/

Hi,

I read that article, but I thought it was about extracting the header and footer of a document?

For example, I have this line of code:

strResume = New Aspose.Word.Document(File.FullName).Range.Text

Now the content of my document is in strResume.

How do I know which characters are bold or underlined?

Thanks

Shu Yih

Range returns plain text without formatting, so it’s impossible to tell from it which characters are bold or underlined. IDocumentVisitor, on the other hand, gives you formatted runs of text along with other kinds of objects. Please look at the example shown in the article and create a similar class for your document. This method is called each time a run of text is encountered:

// Called for each run of text; font carries that run's formatting.
public void RunOfText(Font font, string text)
{
    if (isExtracting)
    {
        // Append the text to the buffer of the story being extracted.
        switch (extractingType)
        {
            case StoryType.PrimaryHeaderStory:
                primaryHeader.Append(text);
                break;
            case StoryType.PrimaryFooterStory:
                primaryFooter.Append(text);
                break;
            case StoryType.MainTextStory:
                mainText.Append(text);
                break;
        }
    }
}

The font parameter represents the formatting properties of that particular run of text. Use it to determine whether the text is bold, underlined, etc.
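
For readers on a current Aspose.Words for .NET release, here is a minimal sketch of the same idea written without a visitor. It is not the approach from the article above: it assumes the present-day API, where each Run node exposes its formatting through a Font property, and it simply walks all runs and collects the bold and underlined text. The file name "resume.doc" and the class name are made up for illustration.

```csharp
using System;
using System.Text;
using Aspose.Words;

class BoldTextExtractor
{
    static void Main()
    {
        // "resume.doc" is a hypothetical file name used only for illustration.
        Document doc = new Document("resume.doc");

        StringBuilder boldText = new StringBuilder();
        StringBuilder underlinedText = new StringBuilder();

        // Walk every run in the document (all stories, so headers and
        // footers are included as well as the main text).
        foreach (Run run in doc.GetChildNodes(NodeType.Run, true))
        {
            // Font.Bold is true when the run is formatted as bold.
            if (run.Font.Bold)
                boldText.Append(run.Text);

            // Any underline style other than None counts as underlined.
            if (run.Font.Underline != Underline.None)
                underlinedText.Append(run.Text);
        }

        Console.WriteLine("Bold: " + boldText);
        Console.WriteLine("Underlined: " + underlinedText);
    }
}
```

With the visitor approach from the article, the same checks would go inside RunOfText, assuming the Font object passed to it exposes comparable Bold and Underline properties in your version of the library.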