Replacing multiple spaces produces wrong output

Hello. I use the code below to replace the runs of spaces before the option labels A, B, C, and D with a tab. Why does it go wrong in this case, with the content of option B moving to option C? Thank you.

Dim repo As New Replacing.FindReplaceOptions()
Dim strABCD As String = "ABCD"
For i = 0 To 3
	doc.Range.Replace(New Regex(" {2,}" + strABCD(i) + "\."), ControlChar.Tab + strABCD(i) + ".", repo)
Next

Input.docx (21.0 KB)

Output.docx (21.4 KB)

@asposewuser

It seems that your code, which replaces multiple spaces before the characters A, B, C, and D, moves the content of option B to option C. This suggests that the replacement is not handling the spacing as intended.

In your current implementation, you are using a regular expression to match two or more spaces followed by a character (A, B, C, or D) and a period. However, if the replacement is not correctly aligned with the text structure, it can lead to unintended shifts in the content.

To address this issue, consider the following suggestions:

  1. Check the Regular Expression: Ensure that your regex pattern accurately captures the intended spaces and characters. You might want to test the regex separately to confirm it behaves as expected.

  2. Use a More Specific Replacement Logic: Instead of replacing all occurrences in a loop, you could refine your approach to ensure that each replacement is handled distinctly. For example, you can use a single regex pattern that captures all characters (A, B, C, D) and their preceding spaces in one go.

  3. Example Code: Here’s an example of how you might adjust your code to ensure that replacements are handled correctly:

    Dim repo As New Replacing.FindReplaceOptions()
    ' One pattern for all four option labels; group 1 captures the letter.
    Dim regexPattern As String = " {2,}([ABCD])\."
    Dim matches As MatchCollection = Regex.Matches(doc.Range.Text, regexPattern)

    For Each match As Match In matches
        Dim character As String = match.Groups(1).Value
        ' Replace the matched literal text (spaces + letter + period) with tab + letter + period.
        doc.Range.Replace(match.Value, ControlChar.Tab + character + ".", repo)
    Next

This code captures the character (A, B, C, or D) in a single regex match and replaces it accordingly, which should help prevent the content from shifting incorrectly.

If the issue persists, please provide more details about the structure of your document or any specific examples of the text that is causing the problem. This will help in diagnosing the issue further.

Thank you, but the result is still the same.

@asposewuser You can achieve what you need using substitutions. Please see the following code:

[C#]

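using System.Text.RegularExpressions;
using Aspose.Words;
using Aspose.Words.Replacing;

Document doc = new Document(@"C:\Temp\in.docx");

FindReplaceOptions opt = new FindReplaceOptions();
// Allow $1 in the replacement to refer to the first capture group.
opt.UseSubstitutions = true;

// Replace two or more whitespace characters before "A.", "B.", "C." or "D." with a tab.
doc.Range.Replace(new Regex(@"\s{2,}([ABCD]\.)"), "\t$1", opt);

doc.Save(@"C:\Temp\out.docx");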

[VB.NET]

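Imports System.Text.RegularExpressions
Imports Aspose.Words
Imports Aspose.Words.Replacing

Dim Doc As New Document("C:\Temp\in.docx")

Dim opt As New FindReplaceOptions()
' Allow $1 in the replacement to refer to the first capture group.
opt.UseSubstitutions = True

' Replace two or more whitespace characters before "A.", "B.", "C." or "D." with a tab.
Doc.Range.Replace(New Regex("\s{2,}([ABCD]\.)"), ControlChar.Tab + "$1", opt)

Doc.Save("C:\Temp\out.docx")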
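
To see what the `$1` substitution does independently of Aspose.Words, here is a minimal sketch that applies the same pattern to a plain string with .NET's `Regex.Replace`. The sample line and the `CollapseOptionSpacing` helper name are made up for illustration. It only demonstrates the pattern itself and says nothing about how a particular Aspose.Words version handles the runs in a document.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module SubstitutionDemo
    ' Hypothetical helper: replaces two or more whitespace characters before
    ' "A.", "B.", "C." or "D." with a single tab; $1 re-inserts the captured label.
    Function CollapseOptionSpacing(text As String) As String
        Return Regex.Replace(text, "\s{2,}([ABCD]\.)", vbTab & "$1")
    End Function

    Sub Main()
        ' Made-up sample line, not taken from Input.docx.
        Dim line As String = "A. one    B. two     C. three   D. four"
        ' Show tabs as <TAB> so the result is visible in the console.
        Console.WriteLine(CollapseOptionSpacing(line).Replace(vbTab, "<TAB>"))
        ' Prints: A. one<TAB>B. two<TAB>C. three<TAB>D. four
    End Sub
End Module
```

If this plain-string version gives the expected result while the document output is still wrong, the pattern is probably not the cause.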

out.docx (18.0 KB)

What version are you using? I tried it and still got the error, but this time it affects option A.

@asposewuser I used the latest version of Aspose.Words, 25.3, for testing.

I use version 20.6, and it still produces the error. I just tried the trial of v25.3 and it worked fine. Is there any way to make this work in the old version? Upgrading to v25.3 is too difficult for me.

@asposewuser Yes, you are right, it looks like a bug in the old version of Aspose.Words. I tested other versions, and the problem has been resolved in version 23.2 of Aspose.Words. Perhaps you can update to that version.

That is too difficult for us. There are budget constraints, and code in other parts of the project would break and need updating for the new version.

@asposewuser Unfortunately, the only workaround I can suggest is to post-process the document after performing the replace operation:

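Imports System.Text.RegularExpressions
Imports Aspose.Words
Imports Aspose.Words.Replacing

Dim Doc As New Document("C:\Temp\in.docx")

Dim opt As New FindReplaceOptions()
opt.UseSubstitutions = True

' Step 1: insert a tab before every option label, leaving the spaces untouched.
Doc.Range.Replace(New Regex("([ABCD]\.)"), ControlChar.Tab + "$1", opt)

' Step 2: remove the whitespace left in the runs that precede each tab + label.
Dim testRegex As New Regex("\t[ABCD]\.")
For Each r As Run In Doc.GetChildNodes(NodeType.Run, True)
    If testRegex.IsMatch(r.Text) Then
        Dim prevRun As Run = TryCast(r.PreviousSibling, Run)
        While prevRun IsNot Nothing
            ' Trim trailing whitespace; if the run becomes empty, keep walking backwards.
            prevRun.Text = prevRun.Text.TrimEnd()
            If prevRun.Text = "" Then
                prevRun = TryCast(prevRun.PreviousSibling, Run)
            Else
                Exit While
            End If
        End While
    End If
Next

Doc.Save("C:\Temp\out.docx")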
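
The trimming loop may be easier to follow on plain data. The sketch below mimics it with a list of strings standing in for the `Run.Text` values of one paragraph. The `TrimBeforeOptions` name and the sample run texts are hypothetical, and no Aspose.Words types are involved, so treat it as an illustration of the logic rather than a drop-in replacement.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module TrimRunsDemo
    ' Hypothetical stand-in for the workaround: each string plays the role of Run.Text.
    ' For every "run" that contains tab + option label, strip trailing whitespace from
    ' the preceding runs, walking backwards as long as the trimmed run ends up empty.
    Sub TrimBeforeOptions(runs As List(Of String))
        Dim optionStart As New Regex("\t[ABCD]\.")
        For i As Integer = 0 To runs.Count - 1
            If Not optionStart.IsMatch(runs(i)) Then Continue For
            Dim j As Integer = i - 1
            While j >= 0
                runs(j) = runs(j).TrimEnd()
                If runs(j) <> "" Then Exit While
                j -= 1
            End While
        Next
    End Sub

    Sub Main()
        ' Made-up run texts: trailing spaces and a whitespace-only run before "B.".
        Dim runs As New List(Of String) From {"A. one  ", "   ", vbTab & "B. two   ", vbTab & "C. three"}
        TrimBeforeOptions(runs)
        ' Show run boundaries as | and tabs as <TAB>.
        Console.WriteLine(String.Join("|", runs).Replace(vbTab, "<TAB>"))
        ' Prints: A. one||<TAB>B. two|<TAB>C. three
    End Sub
End Module
```

Note that only whitespace at the end of the preceding runs is removed; whitespace that sits in the same run as the tab and label is not touched by this loop.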

Thanks a lot for your help. I have understood a lot more.
