Aspose.Words Automatically Adjusting Spaces and Shifting Text: Need to Preserve Original File/Text Structure

Hello,

We have a file that contains token syntax, and during Word file generation, we insert the token name (e.g., <<LONGDATEM01>>) as per the predefined format. After this, we replace the token name with 25 blank spaces. In the next step, we replace these blank spaces back with the token name.

However, this process disrupts the token’s placement, causing misalignment of token images or incorrect positioning due to text shifting.

We need a way to ensure that when the token name is replaced with blank spaces and then restored, the formatting, spacing, and alignment remain intact, preserving the original text structure.

We are using the following versions:
Aspose.Pdf 17.3.0 and Aspose.Words 24.1.0

When a token appears at the end of a line, it does not wrap to the next line as expected. Additional spaces are being added at the end of the line, pushing the text beyond the set margin.

Token Name -

Token Placement -

Please let me know if you need any further information.

@shivrajp MS Word documents are flow documents by nature, so it is expected that the remaining content is reflowed when content changes. Aspose.Words does not control this behavior.

Could you please attach your input and output documents, along with code that will allow us to reproduce the problem on our side? We will check the documents and provide you with more information.

Below are the documents for reference:
doc4.docx (11.6 KB)

doc1.docx (11.6 KB)

doc2.docx (11.5 KB)

doc1.docx (11.6 KB)

Code:
First we run the code below, which replaces the tokens with blank spaces:

Dim PageSize As New [PaperSize]

Using inStream As New MemoryStream(fileData)
    doc = New Aspose.Words.Document(inStream)
    Dim builder As New DocumentBuilder(doc)
    MultiDocxRef.PageCount = doc.PageCount
    Dim PageHeight As Int64 = builder.PageSetup.PageHeight
    Dim PageWidth As Int64 = builder.PageSetup.PageWidth
    Dim DocumentHeight As Int64 = doc.PageCount * PageHeight
    Dim DocumentWidth As Int64 = PageWidth
    MultiDocxRef.PageHeight = PageHeight
    MultiDocxRef.PageWidth = PageWidth
    PageSize = builder.PageSetup.PaperSize

    Dim options As New FindReplaceOptions()
    options.ReplacingCallback = New ReplaceEvaluatorSignature()
    Dim Regex As New Regex(TokenPattern, RegexOptions.None)
    'Replace all tokens with blank string  .
    doc.Range.Replace(Regex, "                        ", New FindReplaceOptions)
    'doc.Range.Replace(Regex, " ")
End Using

For Each section As Section In doc
    section.PageSetup.PaperSize = PageSize
Next

doc.UpdatePageLayout()

Then we restore the token name where we had replaced it with blank spaces:

Using inStream As New MemoryStream(fileData)
    doc = New Aspose.Words.Document(inStream)
    Dim builder As New DocumentBuilder(doc)
    Dim PageHeight As Int64 = builder.PageSetup.PageHeight
    Dim PageWidth As Int64 = builder.PageSetup.PageWidth
    Dim DocumentHeight As Int64 = doc.PageCount * PageHeight
    Dim DocumentWidth As Int64 = PageWidth
    Dim options As New FindReplaceOptions()
    options.ReplacingCallback = New FindAndInsertBookmark()
    Dim Regex As New Regex(TokenPattern, RegexOptions.IgnoreCase)
    doc.Range.Replace(Regex, "", options)
End Using

@shivrajp As mentioned, MS Word documents are flow documents by nature, so it is expected that the remaining content is reflowed when content changes.

If your goal is to visually hide the token while keeping the original document layout, you can simply change the font color of the token to white:

Document doc = new Document(@"C:\Temp\in.docx");

FindReplaceOptions opt = new FindReplaceOptions();
opt.ApplyFont.Color = Color.White;
opt.UseSubstitutions = true;
doc.Range.Replace("<<LONGDATEM01>>", "$0", opt);

doc.Save(@"C:\Temp\out.docx");

doc1.docx (11.6 KB)
out.docx (11.6 KB)

This solution is working for me.
Thank you Alexey!


I need one more suggestion about the code below:

If signedToken.TokenName.Contains("LONGDATE") Then
    Dim months As String() = {"march", "april", "may", "june", "july", "august"}

    If months.Any(Function(m) line.IndexOf(m, StringComparison.OrdinalIgnoreCase) >= 0) Then
        textFragment.TextState.Font = FontRepository.FindFont("Times New Roman") ' Corrected font setting
        textFragment.TextState.FontSize = 10
        textFragment.TextState.HorizontalAlignment = HorizontalAlignment.Center
        textFragment.HorizontalAlignment = HorizontalAlignment.Center
    Else
        textFragment.TextState.FontSize = 8.5
    End If
End If

Here, line is ‘March 20, 2025’, ‘August 20, 2025’, or ‘September 20, 2025’. How can I change the font family of this text? Please also confirm whether the center text alignment is set correctly.

@shivrajp Could you please attach your input and expected output documents here for our reference? We will check the documents and provide you with more information.

Please find attached a document in which I have described the details.
Date Token Center Align Text.docx (24.6 KB)

Thank you!

@shivrajp Thank you for the additional information. You can use code like the following to achieve what you need:
[C#]

Document doc = new Document(@"C:\Temp\in.docx");

Regex dateRegex = new Regex("(January|February|March|April|May|June|July|August|September|October|November|December)\\s+\\d{1,2},\\s+\\d{4}");

FindReplaceOptions opt = new FindReplaceOptions();
// Set font name applied to matched text
opt.ApplyFont.Name = "Arial";
opt.ApplyFont.Color = Color.Red;
opt.ApplyFont.Bold = true;
// Use substitutions to replace the matched value with itself.
opt.UseSubstitutions = true;

doc.Range.Replace(dateRegex, "$0", opt);

// Post-process the document to apply paragraph alignment.
foreach (Run r in doc.GetChildNodes(NodeType.Run, true))
{
    if (dateRegex.IsMatch(r.Text))
        r.ParentParagraph.ParagraphFormat.Alignment = ParagraphAlignment.Center;
}

doc.Save(@"C:\Temp\out.docx");

[VB.NET]

Dim Doc As New Document("C:\Temp\in.docx")

Dim dateRegex As New Regex("(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{1,2},\s+\d{4}")

Dim opt As New FindReplaceOptions()
' Set font name applied to matched text
opt.ApplyFont.Name = "Arial"
opt.ApplyFont.Color = Color.Red
opt.ApplyFont.Bold = True
' Use substitutions to replace the matched value with itself.
opt.UseSubstitutions = True

Doc.Range.Replace(dateRegex, "$0", opt)

For Each r As Run In Doc.GetChildNodes(NodeType.Run, True)
    If dateRegex.IsMatch(r.Text) Then
        r.ParentParagraph.ParagraphFormat.Alignment = ParagraphAlignment.Center
    End If
Next

Doc.Save("C:\Temp\out.docx")

out.docx (14.1 KB)

One question: will this solution still work when the token is surrounded by text that spans multiple lines?
I think setting ParagraphFormat will change the alignment of all the text in that paragraph.

@shivrajp You are free to change the code according to your needs. For example, you can check whether the paragraph contains only the run with the matched text and change the paragraph alignment only in that case. The provided code simply demonstrates the technique.
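
A rough sketch of that check, not a tested solution: the loop below centers a paragraph only when its entire visible text is a single date, so paragraphs that mix a date with other text keep their alignment. The anchored pattern wholeDate and the file paths are assumptions made for illustration.

```csharp
using System.Text.RegularExpressions;
using Aspose.Words;

class Program
{
    static void Main()
    {
        // Placeholder path; replace with your own input document.
        Document doc = new Document(@"C:\Temp\in.docx");

        // Same date pattern as above, anchored so it must cover the whole paragraph text.
        Regex wholeDate = new Regex(
            @"^(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d{1,2},\s+\d{4}$");

        foreach (Paragraph p in doc.GetChildNodes(NodeType.Paragraph, true))
        {
            // Plain text of the paragraph, without the trailing paragraph break.
            string text = p.ToString(SaveFormat.Text).Trim();

            // Center only paragraphs that consist of the date and nothing else.
            if (wholeDate.IsMatch(text))
                p.ParagraphFormat.Alignment = ParagraphAlignment.Center;
        }

        doc.Save(@"C:\Temp\out.docx");
    }
}
```

Checking the paragraph text rather than individual runs also covers the case where the date is split across several runs.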

Thank you Alexey.


@alexey.noskov We are currently facing issues with our recent changes. Please have a look and advise.
We made the following code changes:

Dim options As New FindReplaceOptions()
options.ReplacingCallback = New ReplaceEvaluatorSignature()
options.ApplyFont.Color = System.Drawing.Color.White
options.UseSubstitutions = True
Dim Regex As New Regex(TokenPattern, RegexOptions.None)
'Replace all tokens with blank string  .
doc.Range.Replace(Regex, "$0", options)
Public Class ReplaceEvaluatorSignature
        Implements IReplacingCallback
        Private Function Replacing(ByVal e As ReplacingArgs) As ReplaceAction Implements IReplacingCallback.Replacing
            Dim currentNode As Node = e.MatchNode
            If e.MatchOffset > 0 Then currentNode = SplitRun(CType(currentNode, Run), e.MatchOffset)
            Dim runs As ArrayList = New ArrayList()
            Dim remainingLength As Integer = e.Match.Value.Length
            While (remainingLength > 0) AndAlso (currentNode IsNot Nothing) AndAlso (currentNode.GetText().Length <= remainingLength)
                runs.Add(currentNode)
                remainingLength = remainingLength - currentNode.GetText().Length
                Do
                    currentNode = currentNode.NextSibling
                Loop While (currentNode IsNot Nothing) AndAlso (currentNode.NodeType <> NodeType.Run)
            End While

            If (currentNode IsNot Nothing) AndAlso (remainingLength > 0) Then
                SplitRun(CType(currentNode, Run), remainingLength)
                runs.Add(currentNode)
            End If

            Dim TestName As String = "Nausherwan"
            Dim builder As DocumentBuilder = New DocumentBuilder(TryCast(e.MatchNode.Document, Aspose.Words.Document))
            builder.MoveTo(CType(runs(runs.Count - 1), Run))

            If TestName.Length > 10 Then
                builder.Write(TestName)
            Else
                Return ReplaceAction.Replace
            End If

            For Each run As Run In runs
                run.Remove()
            Next

            Return ReplaceAction.Skip
        End Function
End Class

We are getting the exception shown in the screenshot below:

@shivrajp Most likely the problem occurs when you try to remove nodes that have already been removed. Please try adding a condition that checks whether the node’s parent node is not null before removing it.

Also, it is not quite clear why you use IReplacingCallback here. It looks like it is not required anymore.

I made the changes below:

Private Function Replacing(ByVal e As ReplacingArgs) As ReplaceAction Implements IReplacingCallback.Replacing
            Dim currentNode As Node = e.MatchNode
            If e.MatchOffset > 0 Then currentNode = SplitRun(CType(currentNode, Run), e.MatchOffset)
            Dim runs As ArrayList = New ArrayList()
            Dim remainingLength As Integer = e.Match.Value.Length
            While (remainingLength > 0) AndAlso (currentNode IsNot Nothing) AndAlso (currentNode.NodeType = NodeType.Run)
                Dim run As Run = CType(currentNode, Run)
                runs.Add(run)
                remainingLength -= run.Text.Length
                currentNode = currentNode.NextSibling
            End While

            
            If (currentNode IsNot Nothing) AndAlso (remainingLength > 0) Then
                currentNode = SplitRun(CType(currentNode, Run), remainingLength)
                runs.Add(CType(currentNode, Run))
            End If

            
            If runs.Count = 0 Then Return ReplaceAction.Skip

            
            Dim builder As New DocumentBuilder(TryCast(e.MatchNode.Document, Aspose.Words.Document))
            builder.MoveTo(CType(runs(runs.Count - 1), Run))

            Dim TestName As String = "Nausherwan"
            If TestName.Length > 10 Then
                builder.Write(TestName)
            Else
                Return ReplaceAction.Replace
            End If

            
            For Each run As Run In runs
                If run IsNot Nothing Then
                    run.Remove()
                End If
            Next

            Return ReplaceAction.Skip
        End Function

Are these changes correct? One more thing: earlier we were using the code below, and we have since changed it:

If TestName.Length > 10 Then
                builder.Write(TestName & "        " & TestName)
            Else
                builder.Write(TestName & "          " & TestName & "         " & TestName)
            End If

Please let me know whether my approach is correct, and suggest changes if it is not.

@shivrajp Your condition checks whether the node itself is not null, but the suggestion was to check whether the parent node of the node to be removed is not null:

For Each run As Run In runs
    If run.ParentNode IsNot Nothing Then
        run.Remove()
    End If
Next

As I mentioned, it looks like with the new approach IReplacingCallback is not needed anymore.
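
If the callback has to stay for other reasons, one possible shape (a sketch under assumptions, not a confirmed fix for the exception) is a callback that leaves the node tree alone and returns ReplaceAction.Replace, so Aspose.Words performs the substitution and applies the font itself. The class name PassThroughCallback, the token pattern, and the file paths are hypothetical.

```csharp
using System.Drawing;
using System.Text.RegularExpressions;
using Aspose.Words;
using Aspose.Words.Replacing;

// Hypothetical callback: keeps the IReplacingCallback hook but does not
// split, insert, or remove any nodes itself.
class PassThroughCallback : IReplacingCallback
{
    public ReplaceAction Replacing(ReplacingArgs e)
    {
        // Custom logic that only reads e.Match or e.MatchNode could go here.
        return ReplaceAction.Replace;
    }
}

class Program
{
    static void Main()
    {
        // Placeholder path; replace with your own input document.
        Document doc = new Document(@"C:\Temp\in.docx");

        FindReplaceOptions opt = new FindReplaceOptions();
        opt.ReplacingCallback = new PassThroughCallback();
        opt.ApplyFont.Color = Color.White;
        opt.UseSubstitutions = true;

        // Assumed token pattern; substitute the real TokenPattern value.
        Regex tokenRegex = new Regex("<<[A-Z0-9]+>>");

        // "$0" replaces each match with itself, so only the font changes.
        doc.Range.Replace(tokenRegex, "$0", opt);

        doc.Save(@"C:\Temp\out.docx");
    }
}
```

Because the callback no longer edits runs by hand, the replace operation never encounters nodes that were removed from under it.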

@alexey.noskov With the suggested changes we still get the same issue, and we have a dependency that prevents us from removing IReplacingCallback.
Please suggest a new approach if you have one.
We get the exception below with the current changes:

   at    .(Node , String , Int32 )
   at    .(Int32 , String )
   at    .(String , Int32 , Int32 )
   at    .(String , Int32 )
   at    .(Node )
   at    .()
   at ysi.commercialcafe.dealmanager.library.AppClasses.ySignatureApp.GetMultiDocXrefDimensionsAndSignedFile(ySignatureMultiDocXRef& MultiDocxRef, Int64 FileId, Document& CompleteDoc) in C:\tfs\Cafes\AngularCafe\Products\Commercial\devnew\DealManager\ysi.commercialcafe.dealmanager.library\AppClasses\Legal\ySignatureApp.vb:line 1219
   at ysi.commercialcafe.dealmanager.library.AppClasses.ySignatureApp.VB$StateMachine_11_CreateMultiDoc.MoveNext() in C:\tfs\Cafes\AngularCafe\Products\Commercial\devnew\DealManager\ysi.commercialcafe.dealmanager.library\AppClasses\Legal\ySignatureApp.vb:line 412

@shivrajp Could you please attach the problematic input document here for testing, or create a simple console application that will allow us to reproduce the problem on our side? We will check the issue and provide you with more information.

The issue has been resolved. Thank you!
