Inconsistent handling of Regex in Range.Replace

Hi all,

I’m seeing what I think (based on my reading of the documentation) is inconsistent behavior when passing a Regex to Range.Replace together with an IReplacingCallback implementation.

The example code below reproduces it; the comments describe what happens at each step.

Imports System.Collections
Imports System.Text.RegularExpressions
Imports Aspose.Words

Public Sub test(iDocument As Document)
    'The document has 2 occurrences of the @Table_Start@ string and all processing works as expected
    Dim regex_start As New Regex("@Table_Start@", RegexOptions.IgnoreCase)
    Dim obj_start As New FindNode()
    Dim startPara As Paragraph = Nothing
    Dim startParaArray As New ArrayList

    'Collect the parent paragraph of every matched node
    iDocument.Range.Replace(regex_start, obj_start, False)
    For Each node As Node In obj_start.nodes
        startPara = DirectCast(node.ParentNode, Paragraph)
        startParaArray.Add(startPara)
    Next

    '<>

    'This works fine
    If startParaArray.Count > 0 Then
        iDocument.Range.Replace(regex_start, "")
    End If

    'The document also has multiple occurrences of a pattern that will match the regex below.
    'Despite there being 2 occurrences of @ref_para1@ and several others that match @ref_*@, only 1 occurrence of @ref_para1@ is found
    Dim regex_ref As New Regex("@ref_(.*)@", RegexOptions.IgnoreCase)
    Dim obj_ref As New FindNode()
    Dim refPara As Paragraph = Nothing
    Dim refParaArray As New ArrayList

    iDocument.Range.Replace(regex_ref, obj_ref, False)
    For Each node As Node In obj_ref.nodes
        refPara = DirectCast(node.ParentNode, Paragraph)
        refParaArray.Add(refPara)
    Next

    '<>

    'And this throws an exception "The match includes one or more special or break characters and cannot be replaced."
    If refParaArray.Count > 0 Then
        iDocument.Range.Replace(regex_ref, "")
    End If
End Sub

Public Class FindNode
    Implements IReplacingCallback

    'Stores the matched nodes
    Public nodes As New ArrayList()

    Private Function IReplacingCallback_Replacing(e As ReplacingArgs) As ReplaceAction Implements IReplacingCallback.Replacing
        Dim currentNode As Node = e.MatchNode
        nodes.Add(currentNode)

        'Tell the replace engine to do nothing, because we have already done everything we wanted.
        Return ReplaceAction.Skip
    End Function
End Class

Hi Brian,

Thanks for your inquiry. To ensure a timely and accurate response, please attach the following resources here for testing:

  • Your input Word document
  • The output document generated by Aspose.Words that shows the undesired behavior
  • Your expected document showing the correct output, created using Microsoft Word
  • A standalone console application (source code without compilation errors) that helps us reproduce your problem on our end

As soon as you have these resources ready, we’ll start investigating your issue and provide more information. Thanks for your cooperation.

PS: To attach these resources, please zip them and click the ‘Reply’ button, which will bring you to the reply page. At the bottom of that page you can include attachments with your post by clicking the ‘Add/Update’ button.

Best regards,

Hi Awais,

Project and sample files in the debug folder

Thanks

Hi Brian,

Thanks for your inquiry.

Using the latest version of Aspose.Words (16.4.0), we managed to reproduce this issue on our end. We have logged it in our bug tracking system as WORDSNET-13737. Your request has been linked to that issue, and you will be notified as soon as it is resolved. Sorry for the inconvenience.

Best regards,

Hi Brian,

Regarding WORDSNET-13737, our product team has completed the initial work on your issue and has concluded that the behavior you’re observing is not a bug in Aspose.Words. So, we will close this issue as ‘Not a Bug’.

You need to make the quantifier in your regex pattern “lazy”. Please see the section “Greedy and Lazy Quantifiers” in the following article:

https://docs.microsoft.com/en-us/dotnet/standard/base-types/quantifiers-in-regular-expressions

For your case the pattern would be “@ref_(.*?)@”, and it should find and replace all matches in your document.
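To make the difference concrete, here is a minimal sketch that uses only System.Text.RegularExpressions (no Aspose.Words calls). The sample text, the GreedyVersusLazy module and the FindAll helper are hypothetical names made up for illustration. The sketch assumes that paragraph breaks appear in the searched text as carriage returns; in .NET the "." character matches everything except a line feed, so under that assumption a greedy ".*" can run across paragraph breaks up to the last "@". That would be consistent with both the single match and the "special or break characters" exception reported above.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module GreedyVersusLazy
    ' Hypothetical helper: returns every substring of text matched by pattern, in order.
    Function FindAll(pattern As String, text As String) As String()
        Dim matches As MatchCollection = Regex.Matches(text, pattern, RegexOptions.IgnoreCase)
        Dim result(matches.Count - 1) As String
        For i As Integer = 0 To matches.Count - 1
            result(i) = matches(i).Value
        Next
        Return result
    End Function

    Sub Main()
        ' Assumed sample: three "paragraphs" separated by carriage returns.
        Dim text As String = "@ref_para1@" & vbCr & "Some text" & vbCr & "@ref_para1@"

        ' Greedy: .* extends to the last "@", so the whole text is one match
        ' and that match contains the carriage returns.
        Dim greedy As String() = FindAll("@ref_(.*)@", text)
        Console.WriteLine("greedy matches: " & greedy.Length)                    ' greedy matches: 1
        Console.WriteLine("greedy match spans a break: " & greedy(0).Contains(vbCr)) ' True

        ' Lazy: .*? stops at the first "@" that closes each placeholder.
        Dim lazy As String() = FindAll("@ref_(.*?)@", text)
        Console.WriteLine("lazy matches: " & lazy.Length)                        ' lazy matches: 2
    End Sub
End Module
```

In the code from the original post, the equivalent change would be to pass "@ref_(.*?)@" to New Regex when creating regex_ref; the rest of the code stays the same.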

Best regards,