Problem extracting content between nodes

absis · July 20, 2011, 3:47am

Hi,

I have a function that extracts the content between two nodes. The problem is that when the extracted text includes numbering, the numbering loses its formatting.

This is the method that extracts the content between nodes:

Shared Function ExtractContentBetweenNodes(ByRef startNode As Words.Node, ByRef endNode As Words.Node) As Words.Document
    Dim sStr As String
    Dim oDocSource, oDocDst As Words.Document
    Dim firstSect, currNode, dstNode, sect As Words.Node
    Dim bEnd As Boolean = False

    Try
        'Check that the start and end nodes are children of the document body
        If startNode.ParentNode.NodeType <> Words.NodeType.Body OrElse endNode.ParentNode.NodeType <> Words.NodeType.Body Then
            sStr = "To extract the content of a bookmark, the start and end nodes must be children of the document body."

            Throw New Exception(sStr)
        End If

        'Clone the original document.
        'This is needed to preserve the styles of the original document
        oDocSource = startNode.Document
        oDocDst = oDocSource.Clone
        oDocDst.RemoveAllChildren()

        'Copy the parent section of the start node into the destination document
        firstSect = oDocDst.ImportNode(startNode.GetAncestor(Words.NodeType.Section), True, Words.ImportFormatMode.KeepSourceFormatting)
        oDocDst.AppendChild(firstSect)

        'Remove the content of the section, keeping headers and footers
        oDocDst.LastSection.Body.RemoveAllChildren()

        'Copy the content
        currNode = startNode
        While Not bEnd
            'Check whether we have reached the end of the content to extract
            If currNode.Equals(endNode) Then bEnd = True

            'Import the node
            dstNode = oDocDst.ImportNode(currNode, True)
            oDocDst.LastSection.Body.AppendChild(dstNode)

            'Move to the next node
            If currNode.NextSibling IsNot Nothing Then
                currNode = currNode.NextSibling

            Else
                'Move to the next section
                sect = currNode.GetAncestor(Words.NodeType.Section)
                If sect.NextSibling IsNot Nothing Then
                    dstNode = oDocDst.ImportNode(sect.NextSibling, True, Words.ImportFormatMode.KeepSourceFormatting)
                    oDocDst.AppendChild(dstNode)
                    oDocDst.LastSection.Body.RemoveAllChildren()
                    currNode = CType(sect.NextSibling, Words.Section).Body.FirstChild

                Else
                    Exit While
                End If
            End If
        End While

    Catch ex As Exception
        'Rethrow without resetting the stack trace
        Throw
    Finally
        oDocSource = Nothing
        firstSect = Nothing
        currNode = Nothing
        dstNode = Nothing
        sect = Nothing
    End Try
    'To test the result
    oDocDst.Save("d:\test\out.odt")
    Return oDocDst
End Function

I have attached the document that shows this problem.

Thanks for everything.

Best regards,

alexey.noskov · July 20, 2011, 2:07pm

Hi
Thanks for your request. Could you please specify which nodes you are trying to extract content between? Please also attach your output document. This will help us better understand the problem.
Best regards,

absis · July 21, 2011, 1:50am

Hi,

I have attached the output document.

The start node is the parent node of the BookmarkStart and the end node is the parent node of the BookmarkEnd. For example:

'Get bookmark
oBk = oDocContent.Range.Bookmarks(sIdBookmark)
'Get start node and end node
oNodeIni = oBk.BookmarkStart.ParentNode
oNodeEnd = oBk.BookmarkEnd.ParentNode

Thanks for all.

Best regards,

alexey.noskov · July 21, 2011, 2:02am

Hi
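Thank you for the additional information. To preserve numbering, you have to use NodeImporter instead of the Document.ImportNode method. A single NodeImporter keeps its import context across calls, so styles and list definitions are translated once and the imported paragraphs keep referring to the same list. Please see the following link for more information:
https://reference.aspose.com/words/net/aspose.words/nodeimporter/
I modified your code accordingly: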

Shared Function ExtractContentBetweenNodes(ByRef startNode As Node, ByRef endNode As Node) As Document
    Dim sStr As String
    Dim oDocSource, oDocDst As Document
    Dim firstSect, currNode, dstNode, sect As Node
    Dim bEnd As Boolean = False

    Try
        'Check that the start and end nodes are children of the document body
        If startNode.ParentNode.NodeType <> NodeType.Body OrElse endNode.ParentNode.NodeType <> NodeType.Body Then
            sStr = "To extract the content of a bookmark, the start and end nodes must be children of the document body."

            Throw New Exception(sStr)
        End If
        'Clone the original document.
        'This is needed to preserve the styles of the original document
        oDocSource = startNode.Document
        oDocDst = oDocSource.Clone
        oDocDst.RemoveAllChildren()
        'Create a single NodeImporter and reuse it for every import.
        Dim importer As New NodeImporter(oDocSource, oDocDst, ImportFormatMode.KeepSourceFormatting)
        'Copy the parent section of the start node into the destination document
        firstSect = importer.ImportNode(startNode.GetAncestor(NodeType.Section), True)
        oDocDst.AppendChild(firstSect)
        'Remove the content of the section, keeping headers and footers
        oDocDst.LastSection.Body.RemoveAllChildren()

        'Copy the content
        currNode = startNode
        While Not bEnd
            'Check whether we have reached the end of the content to extract
            If currNode.Equals(endNode) Then bEnd = True
            'Import the node
            dstNode = importer.ImportNode(currNode, True)
            oDocDst.LastSection.Body.AppendChild(dstNode)

            'Move to the next node
            If currNode.NextSibling IsNot Nothing Then
                currNode = currNode.NextSibling

            Else
                'Move to the next section
                sect = currNode.GetAncestor(NodeType.Section)
                If sect.NextSibling IsNot Nothing Then
                    dstNode = importer.ImportNode(sect.NextSibling, True)
                    oDocDst.AppendChild(dstNode)
                    oDocDst.LastSection.Body.RemoveAllChildren()
                    currNode = CType(sect.NextSibling, Section).Body.FirstChild

                Else
                    Exit While
                End If
            End If
        End While
    Catch ex As Exception
        'Rethrow without resetting the stack trace
        Throw
    Finally
        oDocSource = Nothing
        firstSect = Nothing
        currNode = Nothing
        dstNode = Nothing
        sect = Nothing
    End Try
    Return oDocDst
End Function
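
For completeness, here is a minimal sketch of how the function above could be called with the bookmark nodes from your post. The file paths, the bookmark name "MyBookmark", and the class name DocHelper (standing in for whatever class holds ExtractContentBetweenNodes) are placeholders I made up for illustration, so adjust them to your project:

```vbnet
Imports Aspose.Words

Module ExtractBookmarkExample
    Sub Main()
        ' Hypothetical input file and bookmark name; replace with your own.
        Dim oDocContent As New Document("d:\test\in.docx")
        Dim oBk As Bookmark = oDocContent.Range.Bookmarks("MyBookmark")
        If oBk Is Nothing Then
            Throw New ArgumentException("Bookmark not found in the document.")
        End If

        ' Take the parent nodes of the bookmark start and end.
        ' The function throws unless both are direct children of the Body,
        ' which is the case when the bookmark sits in top-level paragraphs.
        Dim oNodeIni As Node = oBk.BookmarkStart.ParentNode
        Dim oNodeEnd As Node = oBk.BookmarkEnd.ParentNode

        ' DocHelper is a hypothetical class that contains the Shared function.
        Dim oDocDst As Document = DocHelper.ExtractContentBetweenNodes(oNodeIni, oNodeEnd)

        ' Save the extracted content to check that the numbering is preserved.
        oDocDst.Save("d:\test\out.docx")
    End Sub
End Module
```

If the bookmark starts or ends inside a table cell, the parent paragraph is not a child of the Body and the check at the top of the function will throw.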

Hope this helps.
Best regards,

absis · July 21, 2011, 2:46am

This solution fixes the problem.

Thanks for the quick support.

Best regards,