The ListLabel class defines properties specific to a list label. The ListLabel.LabelString property returns the string representation of a list label.
The following code example shows how to extract the list labels of all paragraphs that are list items.
using System;
using System.Linq;
using Aspose.Words;
using Aspose.Words.Lists;

// MyDir is the folder that contains the input document.
Document doc = new Document(MyDir + "Rendering.docx");
// List labels are only available after they have been calculated.
doc.UpdateListLabels();
NodeCollection paras = doc.GetChildNodes(NodeType.Paragraph, true);
// Find the paragraphs that are list items. In our document, the list uses plain Arabic numbers,
// which start at three and end at six.
foreach (Paragraph paragraph in paras.OfType<Paragraph>().Where(p => p.ListFormat.IsListItem))
{
    Console.WriteLine($"List item paragraph #{paras.IndexOf(paragraph)}");
    // This is the text we get when we output this node to text format.
    // This text output will omit list labels. Trim any paragraph formatting characters.
    string paragraphText = paragraph.ToString(SaveFormat.Text).Trim();
    Console.WriteLine($"\tExported Text: {paragraphText}");
    ListLabel label = paragraph.ListLabel;
    // This gets the position of the paragraph in the current level of the list. If we have a list with multiple levels,
    // this will tell us what position it is on that level.
    Console.WriteLine($"\tNumerical Id: {label.LabelValue}");
    // Combine them to include the list label with the text in the output.
    Console.WriteLine($"\tList label combined with text: {label.LabelString} {paragraphText}");
}
If this does not help, please ZIP and attach your input Word document along with the expected output. We will then provide more information.
Thank you, this helped a lot. I now get all the text with bullet styling. I am now trying to get the bullets that are under a certain Heading 1 or that have bold styling. If you or anyone else knows how to achieve this, that would be great. Any help is welcome.
Please note that formatting is applied on a few different levels. For example, let's consider the formatting of simple text. Text in a document is represented by Run elements, and a Run can only be a child of a Paragraph. You can apply formatting
- to Run nodes by using Character Styles e.g. a Glyph Style.
- to the parent of those Run nodes i.e. a Paragraph node (possibly via paragraph Styles).
- to Run nodes directly by using Run attributes (Font). In this case the Run takes its formatting from the Paragraph Style first, then the Glyph Style, and then the direct formatting, with each later level overriding the earlier ones.
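To make the three levels concrete, here is a minimal sketch that applies each of them to one run and then reads the result back. It assumes a new blank document, and the character style name "MyEmphasis" is made up for the example; please treat it as an illustration rather than the recommended way to format your own documents.

```csharp
using System;
using Aspose.Words;

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);

// Level 1: formatting that comes from the paragraph style.
builder.ParagraphFormat.StyleIdentifier = StyleIdentifier.Heading1;

// Level 2: formatting that comes from a character style.
// "MyEmphasis" is a hypothetical style name used only in this sketch.
Style emphasis = doc.Styles.Add(StyleType.Character, "MyEmphasis");
emphasis.Font.Italic = true;
builder.Font.Style = emphasis;

// Level 3: direct formatting set on the run itself.
builder.Font.Bold = true;
builder.Writeln("Formatted text");

// Read the formatting back from the run that was just written.
Run run = doc.FirstSection.Body.FirstParagraph.Runs[0];
Console.WriteLine($"Paragraph style: {run.ParentParagraph.ParagraphFormat.StyleName}");
Console.WriteLine($"Character style: {run.Font.StyleName}");
Console.WriteLine($"Bold: {run.Font.Bold}, Italic: {run.Font.Italic}");
```

This is why a check such as Font.Bold on a run can be true even when nobody pressed "Bold" on that text: the value may come from the paragraph style or a character style rather than from direct formatting.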
In your case, we suggest that you iterate over all paragraphs of the document, as shown in my previous post. You can get the style of a paragraph or the font formatting of its text as shown below. Hope this helps you.
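foreach (Paragraph paragraph in paras.OfType<Paragraph>().Where(p => p.ListFormat.IsListItem))
{
    // The list item itself uses the Heading 1 paragraph style.
    if (paragraph.ParagraphFormat.StyleIdentifier == StyleIdentifier.Heading1)
    {
        //Your code...
    }
    // The first run of the list item is bold. Check the run count first,
    // because an empty paragraph has no runs.
    if (paragraph.Runs.Count > 0 && paragraph.Runs[0].Font.Bold)
    {
        //Your code...
    }
}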
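If what you need is the bullets that appear under a particular Heading 1 (rather than bullets that are themselves styled as Heading 1), one possible approach is to walk the paragraphs in document order and remember which heading you are currently under. The sketch below assumes this reading of the question; the file name and the heading text "My heading" are placeholders, so please replace them with your own.

```csharp
using System;
using Aspose.Words;

// Placeholder input file and heading text; both are assumptions for this sketch.
Document doc = new Document("Rendering.docx");
doc.UpdateListLabels();

string targetHeading = "My heading";
bool underTargetHeading = false;

// GetChildNodes returns the paragraphs in document order.
foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true))
{
    if (paragraph.ParagraphFormat.StyleIdentifier == StyleIdentifier.Heading1)
    {
        // A new Heading 1 starts a new section: check whether it is the one we want.
        underTargetHeading = paragraph.ToString(SaveFormat.Text).Trim() == targetHeading;
        continue;
    }

    // Print only the list items that follow the target heading.
    if (underTargetHeading && paragraph.ListFormat.IsListItem)
    {
        string text = paragraph.ToString(SaveFormat.Text).Trim();
        Console.WriteLine($"{paragraph.ListLabel.LabelString} {text}");
    }
}
```

The flag is reset whenever another Heading 1 is reached, so list items under later headings are skipped. If your headings sit inside tables or text boxes, the document order of paragraphs may not match what you see on the page, so the result is worth checking against your own file.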