GetChildNodes returns incorrect count for StructuredDocumentTag using .NET

Hi, we have attached a docx document containing 5 SDTs.

The problem is that the GetChildNodes method finds only 1 SDT. It does not find the SDTs that contain section breaks.

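using System;
using Aspose.Words;

var docx = new Document("Test1.docx");
// Collect every StructuredDocumentTag in the document, at any depth.
var nodes = docx.GetChildNodes(NodeType.StructuredDocumentTag, true);
// We expect 5 here; the latest version prints 1.
Console.WriteLine(nodes.Count);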

Up to Aspose.Words version 19.6 this method worked correctly and found all 5 SDTs. With the latest version of Aspose.Words it no longer returns all of them. Is there a way to work around this issue?

Attached docx.zip (17.0 KB)

@davidepedrocca

In the current Aspose.Words document model, only Section nodes can be inserted into the Document node. We have already logged this feature request as WORDSNET-13519 in our issue tracking system. You will be notified via this forum thread once the feature is available. We apologize for the inconvenience.

Please check the details of StructuredDocumentTag. A StructuredDocumentTag can occur in a document in the following places:

  • Block-level - Among paragraphs and tables, as a child of a Body, HeaderFooter, Comment, Footnote or a Shape node.
  • Row-level - Among rows in a table, as a child of a Table node.
  • Cell-level - Among cells in a table row, as a child of a Row node.
  • Inline-level - Among inline content, as a child of a Paragraph.
  • Nested inside another StructuredDocumentTag.

Using Aspose.Words 19.6, we see that all of our SDTs are children of Body nodes, which in turn are children of Section nodes. So I suppose the SDTs occur at block level. Am I wrong? All sections are children of the Document node.

We have been working with this since September 2017, and this functionality was always supported up to Aspose.Words 19.6. What have you changed?

This change in functionality is a big problem for us; you cannot disregard the backward compatibility of the software. We need to update to a newer version of the product for other fixes, but because of this we can't.

@davidepedrocca

In your document, the section break is inside the content control. You can verify this by unzipping your document and inspecting document.xml. You can also check the child nodes of each content control using the following code example.

Document doc = new Document(MyDir + @"Test1.docx");
foreach (StructuredDocumentTag node in doc.GetChildNodes(NodeType.StructuredDocumentTag, true))
{
    foreach (Node child in node.ChildNodes)
    {
        Console.WriteLine(child.NodeType);
    }
}
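The document.xml check mentioned above can also be scripted. The following is a minimal sketch that uses only the .NET base class library, not the Aspose.Words API. It assumes the main part is stored at word/document.xml inside the package (the usual location), and the names SdtInspector, LoadDocumentXml and CountSdts are hypothetical helpers introduced here for illustration. It counts every w:sdt element and how many of them contain a w:sectPr (section properties, which is how a section break is stored) somewhere inside.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Aspose.Words.
public static class SdtInspector
{
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Reads the main document part from a .docx package.
    // Assumption: the part lives at word/document.xml.
    public static XDocument LoadDocumentXml(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found");
            using (Stream stream = entry.Open())
                return XDocument.Load(stream);
        }
    }

    // Returns the number of w:sdt elements and how many of them
    // have a w:sectPr descendant. A nested w:sdt is counted on its own,
    // and an outer w:sdt also counts if the w:sectPr is inside a nested one.
    public static (int Total, int WithSectionBreak) CountSdts(XDocument documentXml)
    {
        var sdts = documentXml.Descendants(W + "sdt").ToList();
        int withBreak = sdts.Count(sdt => sdt.Descendants(W + "sectPr").Any());
        return (sdts.Count, withBreak);
    }
}
```

A possible usage would be `var counts = SdtInspector.CountSdts(SdtInspector.LoadDocumentXml("Test1.docx"));` followed by printing `counts.Total` and `counts.WithSectionBreak`. On .NET Framework, ZipFile additionally requires a reference to the System.IO.Compression.FileSystem assembly.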

Sorry, we do not understand your answer.

We know that the SDTs contain section breaks, but again, we have been working with this for more than 2 years and this functionality always worked up to Aspose.Words 19.6. We need it to be restored. Moreover, our OOXML structure is valid and MS Word opens the document correctly.

This is a big problem for us, and this functionality cannot simply disappear, because our product is based on it.

We need this functionality to be restored, and if necessary we will request paid support to have this issue fixed.

@davidepedrocca

The behavior of the old version of Aspose.Words is incorrect. Only Section nodes can be inserted into the Document node in the Aspose.Words document model. The latest version of Aspose.Words returns the correct number of content controls.

However, we have logged this problem in our issue tracking system as WORDSNET-20257. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

The issues you found earlier (filed as WORDSNET-20257) have been fixed in the Aspose.Words for .NET 20.5 and Aspose.Words for Java 20.5 updates.

@davidepedrocca

In previous versions of Aspose.Words, you cannot read a content control that contains a section break. Starting from Aspose.Words 20.7, you can read such content controls. We have added read-only properties for content controls that contain a section break. You can find the details of these properties here:

Please use the following code example to get the titles of the content controls.

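var doc = new Document(MyDir + "my_sample.docx");
// Content controls that contain a section break are read as range-start nodes (20.7 and later).
foreach (StructuredDocumentTagRangeStart tag in doc.GetChildNodes(NodeType.StructuredDocumentTagRangeStart, true))
    Console.WriteLine(tag.Title);

// All other content controls are read as regular StructuredDocumentTag nodes.
foreach (StructuredDocumentTag tag in doc.GetChildNodes(NodeType.StructuredDocumentTag, true))
    Console.WriteLine(tag.Title);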

The issues you found earlier (filed as WORDSNET-13519) have been fixed in the Aspose.Words for .NET 20.7 and Aspose.Words for Java 20.7 updates.