How to get the value of the numbering in a docx file?

I am trying to get the numbering value of a paragraph using Aspose.Words for Python. Here is an example:

  1. Text1
  2. Text2
  3. Text3

Right now, when I try to read a Word document with Aspose and compare it with another, it only succeeds in reading this:
Text1
Text2
Text3

I know that the paragraph object has the property list_label.label_value, but it only returns 0. Do you know a way to get the actual value?

Thank you!

@lchevallier1299

To retrieve the actual numbering value of a paragraph in a DOCX file using Aspose.Words, use the ListLabel property of the Paragraph class after calling updateListLabels() on the document. This property provides access to the label information associated with list items, including the numerical value.

Here’s a code example demonstrating how to extract the numbering value from paragraphs that are part of a list:

import com.aspose.words.*;

public class ExtractListNumbering {
    public static void main(String[] args) throws Exception {
        // Load the document
        Document doc = new Document("path/to/your/document.docx");
        
        // Update the list labels to ensure they are correctly calculated
        doc.updateListLabels();

        // Get all paragraphs in the document
        NodeCollection paragraphs = doc.getChildNodes(NodeType.PARAGRAPH, true);

        // Iterate through each paragraph
        for (Paragraph paragraph : (Iterable<Paragraph>) paragraphs) {
            // Check if the paragraph is part of a list
            if (paragraph.getListFormat().isListItem()) {
                // Get the list label
                ListLabel label = paragraph.getListLabel();
                
                // Output the label value and the paragraph text
                System.out.println("List item: " + label.getLabelString() + " " + paragraph.getText().trim());
            }
        }
    }
}

Explanation:

  1. Load the Document: The document is loaded from a specified path.
  2. Update List Labels: This ensures that the list labels are correctly calculated based on the current state of the document.
  3. Iterate Through Paragraphs: The code checks each paragraph to see if it is part of a list.
  4. Retrieve List Label: For paragraphs that are list items, the ListLabel is accessed; getLabelString() returns the formatted label, and getLabelValue() returns the numeric value.
  5. Output: The numbering and the paragraph text are printed to the console.

This approach will allow you to get the actual numbering values for paragraphs in a numbered list within a DOCX file. Make sure to replace "path/to/your/document.docx" with the actual path to your document.

Thank you for the answer. Unfortunately, I forgot to mention that I am using Aspose.Words for Python.

@lchevallier1299 Here is Python code that prints list item labels:

import aspose.words as aw

doc = aw.Document("C:\\Temp\\in.docx")
# Update list labels so they are accessible through the Paragraph.list_label property.
doc.update_list_labels()

# Print the label of every list item.
for node in doc.get_child_nodes(aw.NodeType.PARAGRAPH, True):
    p = node.as_paragraph()
    if p.is_list_item:
        print(p.list_label.label_string)

We have also tried this property, but it returns blanks. In case it helps, I have attached an example of the text we are trying to extract. Thank you very much for your help!
Test.docx (15.5 KB)

@lchevallier1299 To use Paragraph.list_label, you must first call the Document.update_list_labels() method, as shown in the code example above.

Oh, my bad, it does indeed work! Thank you!

If I want to compare the numbering of the same paragraph between two documents, do I have to create my own function to compare it, or does the compare function handle it? Thank you!

@lchevallier1299 The document compare feature in Aspose.Words works similarly to document comparison in MS Word: detected differences between the documents are marked as revisions. The same applies to numbering.
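
If you would rather check the numbering yourself instead of inspecting revisions, here is a minimal sketch of one possible approach. It reuses the calls from the Python example earlier in this thread, plus get_text() to read the paragraph text. The helper names read_list_labels and diff_labels are hypothetical, and pairing list items by position is an assumption that only holds when both documents contain the same list items in the same order.

```python
from itertools import zip_longest


def read_list_labels(path):
    """Return (label, text) pairs for every list item in the document at `path`."""
    # Imported here so that diff_labels below can be used without Aspose.Words.
    import aspose.words as aw

    doc = aw.Document(path)
    # Labels are only available after this call.
    doc.update_list_labels()

    items = []
    for node in doc.get_child_nodes(aw.NodeType.PARAGRAPH, True):
        p = node.as_paragraph()
        if p.is_list_item:
            items.append((p.list_label.label_string, p.get_text().strip()))
    return items


def diff_labels(items_a, items_b):
    """Pair list items by position (an assumption) and report the pairs that differ.

    Each result is (index, item_from_a, item_from_b); a missing item is None.
    """
    differences = []
    for index, (a, b) in enumerate(zip_longest(items_a, items_b)):
        if a != b:
            differences.append((index, a, b))
    return differences
```

With two placeholder file names, it could be called as diff_labels(read_list_labels("first.docx"), read_list_labels("second.docx")). An empty result means the labels and texts match position by position.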