Issue with list IDs and numbering in DOCX parsing

list_example.docx (18.6 KB)

Hello. I’m importing a DOCX Word document and parsing it. When I call paragraph.getListFormat().getList().getListId(), the list ID is 1 both for the first list and for the list inside a table within a text box, while the ID for the list directly inside the text box is 3. These lists were created by copying the first list and pasting it. Is it correct that both lists have an ID of 1 when retrieved by Aspose? This causes incorrect list numbering in my application: after parsing the document, the start index for the list inside the table within the text box becomes 3, so that list starts from 4. How can I handle this situation?
If I right-click on the list inside the table in Word and select ‘Restart Numbering,’ the list ID changes to 4, and the numbering starts correctly from 1.

@Martin123 Aspose.Words returns the list ID stored in the document. If you unzip your document, you will see that the first 3 paragraphs have the list ID set to 1:

<w:p w14:paraId="449010D2" w14:textId="77777777" w:rsidR="00BA6945" w:rsidRDefault="00BA6945" w:rsidP="00BA6945">
	<w:pPr>
		<w:pStyle w:val="ListParagraph"/>
		<w:numPr>
			<w:ilvl w:val="0"/>
			<w:numId w:val="1"/>
		</w:numPr>
		<w:spacing w:after="0" w:line="240" w:lineRule="auto"/>
	</w:pPr>
	<w:r>
		<w:t>A</w:t>
	</w:r>
</w:p>

The list item in the text box has the same list ID:

<w:p w14:paraId="4C378446" w14:textId="77777777" w:rsidR="00F73AA2" w:rsidRDefault="00F73AA2" w:rsidP="00F73AA2">
	<w:pPr>
		<w:pStyle w:val="ListParagraph"/>
		<w:numPr>
			<w:ilvl w:val="0"/>
			<w:numId w:val="1"/>
		</w:numPr>
	</w:pPr>
	<w:r>
		<w:t>A</w:t>
	</w:r>
</w:p>

So the values returned by Aspose.Words correspond to the values stored in the document. I used the following code for testing:

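import com.aspose.words.Document;
import com.aspose.words.NodeType;
import com.aspose.words.Paragraph;

Document doc = new Document("C:\\Temp\\in.docx");
// Recalculate list labels before reading them.
doc.updateListLabels();

// Print the stored list ID and the computed label of every list paragraph.
for (Paragraph p : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    if (p.isListItem())
        System.out.println("List Id:" + p.getListFormat().getList().getListId() + "; List label: " + p.getListLabel().getLabelString());
}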
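The unzip check described above can also be scripted. The following is only a rough sketch that uses the standard Java library, not Aspose.Words: the class name NumIdDump and its helpers are made up for illustration, it assumes the main document part is stored as word/document.xml with the usual w: prefix, and it uses a regular expression as a shortcut instead of a real XML parser.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper (not part of Aspose.Words): lists the w:numId values
// found in the main document part, in document order.
class NumIdDump {
    // Matches <w:numId w:val="N"/>; assumes the conventional "w:" prefix.
    private static final Pattern NUM_ID =
            Pattern.compile("<w:numId\\s+w:val=\"(\\d+)\"\\s*/>");

    // Returns every numId value that appears in the given XML text.
    static List<Integer> numIds(String documentXml) {
        List<Integer> ids = new ArrayList<>();
        Matcher m = NUM_ID.matcher(documentXml);
        while (m.find()) {
            ids.add(Integer.parseInt(m.group(1)));
        }
        return ids;
    }

    // Reads word/document.xml from the DOCX package (a ZIP archive).
    static String readDocumentXml(String docxPath) throws IOException {
        try (ZipFile zip = new ZipFile(docxPath)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IOException("word/document.xml not found in " + docxPath);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java NumIdDump path/to/file.docx
        System.out.println(numIds(readDocumentXml(args[0])));
    }
}
```

Comparing this output with the IDs printed by the Aspose.Words loop is one way to confirm that both report the same stored values.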

Thank you for the explanation. Is there a property I can use to determine that, even though this is the same list, the numbering needs to be restarted so that it looks the same as in the Word document?

@Martin123 It looks like MS Word restarts numbering in the text box. You can use the Paragraph.ListLabel property to get the actual list label value displayed for a specific list item, as shown in the code example above.
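If the application needs an explicit "restart" flag rather than just the label text, one possible approach is to compare the computed label values of consecutive items that share a list ID. The sketch below is plain Java and only an illustration: the class ListRestartDetector and the method findRestarts are hypothetical names, the input arrays are assumed to have been collected per list paragraph (for example from getListId() and, assuming it is available in your version, getListLabel().getLabelValue()), and it only covers single-level lists.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: flags list items whose numbering restarted even though
// they share a list ID with earlier items. Assumes single-level lists.
class ListRestartDetector {
    // listIds[i] and labelValues[i] describe the i-th list paragraph in
    // document order. Returns the indexes where the label value did not
    // increase compared with the previous item of the same list ID.
    static List<Integer> findRestarts(int[] listIds, int[] labelValues) {
        if (listIds.length != labelValues.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        Map<Integer, Integer> lastValue = new HashMap<>();
        List<Integer> restarts = new ArrayList<>();
        for (int i = 0; i < listIds.length; i++) {
            // put() returns the previous value for this list ID, or null.
            Integer previous = lastValue.put(listIds[i], labelValues[i]);
            if (previous != null && labelValues[i] <= previous) {
                restarts.add(i);
            }
        }
        return restarts;
    }

    public static void main(String[] args) {
        // Made-up values shaped like the case in this thread: list 1 (A, B, C),
        // then list 3 in the text box, then list 1 again inside the table.
        int[] ids    = {1, 1, 1, 3, 3, 1, 1};
        int[] values = {1, 2, 3, 1, 2, 1, 2};
        System.out.println(findRestarts(ids, values)); // prints [5]
    }
}
```

Multi-level lists would need a different key, since nested levels restart as part of normal numbering.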

<w:numbering>
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0">
      <w:start w:val="1"/>
      <w:numFmt w:val="decimal"/>
      <w:lvlText w:val="%1."/>
      <w:lvlJc w:val="left"/>
      <w:pPr>
        <w:ind w:left="720" w:hanging="360"/>
      </w:pPr>
    </w:lvl>
    <!-- Define more levels if needed -->
  </w:abstractNum>
  <w:num w:numId="1">
    <w:abstractNumId w:val="0"/>
  </w:num>
</w:numbering>

@demtrio_parrilla The shared XML is not related to the question asked above.