Replace text in PDF using C# and Aspose.PDF | Strange Behavior of the API

I am attempting to replace a paragraph of text with a new one. I find the text using a TextFragmentAbsorber restricted to a rectangle. Portions of the text are still there even if I just set it to empty text, and which text is left varies with the ReplaceAdjustment setting.
Here’s a link to the original file:

https://1drv.ms/b/s!AgfCWFajPdC67VqUhMEgJTykZKyn

A link to an example after running the code below with ShiftRestOfLine:
https://1drv.ms/b/s!AgfCWFajPdC67VwzP-TLsoHaHxiO?e=7PhyTv

And a link to example with None:
https://1drv.ms/b/s!AgfCWFajPdC67VtIj_P_d1o-oYsT?e=BBsj3S

Aspose.Pdf.Rectangle rectForDescription = new Aspose.Pdf.Rectangle(20, 461, 149, 525);

Aspose.Pdf.Text.TextFragmentAbsorber textFragmentAbsorber2 = new Aspose.Pdf.Text.TextFragmentAbsorber();
// search text within page bounds
textFragmentAbsorber2.TextSearchOptions.LimitToPageBounds = true;

// specify the page region for the text search
textFragmentAbsorber2.TextSearchOptions.Rectangle = rectForDescription; // Docs are wrong - should be lX, lY, ux, uy not height and width
//int currentDescrLength = GetSegmentsTextLength(pdf, rectForDescription);
//if (currentDescrLength < introText.Length)
//    textFragmentAbsorber2.TextReplaceOptions.ReplaceAdjustmentAction = Aspose.Pdf.Text.TextReplaceOptions.ReplaceAdjustment.AdjustSpaceWidth;
//else
//    textFragmentAbsorber2.TextReplaceOptions.ReplaceAdjustmentAction = Aspose.Pdf.Text.TextReplaceOptions.ReplaceAdjustment.WholeWordsHyphenation;
textFragmentAbsorber2.TextReplaceOptions.ReplaceAdjustmentAction = Aspose.Pdf.Text.TextReplaceOptions.ReplaceAdjustment.ShiftRestOfLine;

// search text on the first page of the PDF file
pdf.Pages[1].Accept(textFragmentAbsorber2);
Aspose.Pdf.Text.TextFragmentCollection textFragmentCollection2 = textFragmentAbsorber2.TextFragments;

// iterate through the individual TextFragments and blank out each one and its segments
foreach (Aspose.Pdf.Text.TextFragment tf in textFragmentCollection2)
{
    tf.Text = " ";
    foreach (Aspose.Pdf.Text.TextSegment seg in tf.Segments)
    {
        seg.Text = " ";
    }
}
pdf.Save();

@collomd

I have noticed that the problem is fixed if you adjust the upper right x value a little bit. The following code removes all text from the rectangle:

Aspose.Pdf.Rectangle rectForDescription = new Aspose.Pdf.Rectangle(20, 461, 163, 525);
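
One possible reading of why the wider rectangle helps, offered as an assumption rather than confirmed API behavior: if the search only returns fragments whose bounds lie entirely inside TextSearchOptions.Rectangle, a line that runs a few points past x = 149 would be skipped and left in the PDF. The sketch below models that containment check with a hypothetical Box type and made-up line coordinates; it does not call Aspose.PDF.

```csharp
using System;

// Hypothetical stand-in for a rectangle given as (llx, lly, urx, ury).
// This is NOT an Aspose.PDF type; it only illustrates the geometry.
public readonly struct Box
{
    public double Llx { get; }
    public double Lly { get; }
    public double Urx { get; }
    public double Ury { get; }

    public Box(double llx, double lly, double urx, double ury)
    {
        Llx = llx;
        Lly = lly;
        Urx = urx;
        Ury = ury;
    }

    // True when 'inner' lies entirely inside this box (edges may touch).
    public bool FullyContains(Box inner) =>
        inner.Llx >= Llx && inner.Lly >= Lly &&
        inner.Urx <= Urx && inner.Ury <= Ury;
}

public static class RectangleDemo
{
    public static void Main()
    {
        Box narrow = new Box(20, 461, 149, 525); // rectangle from the original post
        Box wide = new Box(20, 461, 163, 525);   // rectangle from the reply above

        // Made-up bounds for a text line that ends at x = 155 (an assumption).
        Box line = new Box(22, 500, 155, 510);

        Console.WriteLine(narrow.FullyContains(line)); // False
        Console.WriteLine(wide.FullyContains(line));   // True
    }
}
```

If this reading is right, comparing each returned fragment's bounds against the search rectangle would show which lines fall outside it.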

Well, that works if the adjustment is set to None, but if I set it to WholeWordsHyphenation I still get the partial text.
What is the best way to replace a paragraph like that with another block of text and have it word-wrap? The wrap seems to work for the first line, but then I get the leftover text from before.
Here’s a replacement for the loop (the rest is the same, except that WholeWordsHyphenation is used):
bool isFirst = true;
foreach (Aspose.Pdf.Text.TextFragment tf in textFragmentCollection2)
{
    tf.Text = " ";
    if (isFirst)
        tf.Text = introText;
    isFirst = false;
}
pdf.Save();

If I do this I get the first line, wrapped, and then the extraneous text shows back up. (It also shows up even if I set the text to empty but set the ReplaceAdjustment to WholeWordsHyphenation.)

@collomd

Thank you for confirming that it works. Can you please share the sample text you are replacing with and your output PDF file, so that we can see how long the replacement text is when the problem occurs?

Well, the actual text will vary in the solution since it’s database driven, but I have the issue with ANY sample text I use. (The text will be short enough to fit in the rectangle provided in the initial query, but could be longer than the current text.) Here’s an example with text; note that it only word-wraps the first line and the old text remains:

bool isFirst = true;
foreach (Aspose.Pdf.Text.TextFragment tf in textFragmentCollection2)
{
    tf.Text = " ";
    if (isFirst)
        tf.Text = "This is sample text that we might include, and usually ends up with strange results";
    isFirst = false;
}
pdf.Save();

https://1drv.ms/b/s!AgfCWFajPdC67V1xQI77pX3_eEcu?e=CaRiGG

@collomd

I have been able to reproduce the issue on our end. A ticket with ID PDFNET-49895 has been created in our issue tracking system for further investigation. This thread has been linked with the ticket so that you will be notified once the issue is fixed.