I am trying to replace some text in a PDF using the replace feature of Aspose.PDF for .NET.
When I use the replace feature, random spaces are introduced in the line.
For example, I tried to replace “Diesem Angebot” with just the letter “m”. As you can see in my screenshots, random spaces appear in the rest of the line. Is there any way to get rid of them?
bug1.png (41.1 KB)
bug2.png (30.9 KB)
@scientillion
Would you please share your sample PDF document along with sample code snippet so that we can test the scenario in our environment and address it accordingly?
page_0003.pdf (30.9 KB)
The PDF I used is the one above. The code doing the actual replace is:
// Get the extracted text fragments
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
// Loop through the fragments
foreach (TextFragment textFragment in textFragmentCollection)
{
    double fragmentWidth = textFragment.Page.CropBox.Width;
    double fragmentHeight = textFragment.Page.CropBox.Height;
    double fragmentWidthRatio = fragmentWidth / doc.PageInfo.Width;
    double fragmentHeightRatio = fragmentHeight / doc.PageInfo.Height;
    double llyReverseVal = doc.PageInfo.Height - (textFragment.Rectangle.LLY / fragmentHeightRatio);
    double llxRationValue = textFragment.Rectangle.LLX / fragmentWidthRatio;
    if (llxRationValue >= shapeX && llxRationValue <= shapeWleft
        && llyReverseVal >= shapeY && llyReverseVal <= shapeHtop)
    {
        // Update text and other properties
        textFragment.Text = txtReplace;
    }
}
@scientillion
We are checking it and will get back to you shortly.
Okay, thanks! I tried other PDFs in the meantime and I get the same behavior: when I replace just a small part of a line, random spaces are introduced in the remaining parts of the line.
@scientillion
Would you please check the attached output PDF and the below code snippet that we used to test the scenario in our environment while using Aspose.PDF for .NET 21.12:
string DocPath = dataDir + @"page_0003.pdf";
Document pdfDocument = new Document(DocPath);
string searchParagraphText = @"Diesem Angebot";
string repParagraphSearch = @"m";
string regSearchText = searchParagraphText.Replace(" ", @"\s*");
Rotation rt;
foreach (Page page in pdfDocument.Pages)
{
    rt = page.Rotate;
    page.Rotate = Rotation.None;
    var textFragmentAbsorber = new TextFragmentAbsorber("(?i)(" + regSearchText + ")", new TextSearchOptions(true));
    textFragmentAbsorber.TextReplaceOptions = new TextReplaceOptions(TextReplaceOptions.ReplaceAdjustment.WholeWordsHyphenation);
    page.Accept(textFragmentAbsorber);
    TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
    foreach (TextFragment textFragment in textFragmentCollection)
    {
        textFragment.Text = repParagraphSearch;
    }
}
string newDocPath = dataDir + @"page_0003_out.pdf"; // Newly generated PDF
pdfDocument.Save(newDocPath);
page_0003_out.pdf (28.8 KB)
You can see in the attached PDF that no space issue is present. Can you please make sure that you are using the latest version of the API? If you are still facing any issues, please share a sample console application that replicates the issue so that we can address it accordingly.
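As a side note on the search pattern used in the snippet above: replacing each space in the search text with `\s*` and prefixing `(?i)` gives a case-insensitive regular expression that tolerates any amount of whitespace (including none) between the words. The following is only a minimal sketch of that string transformation, using plain .NET `System.Text.RegularExpressions` and no Aspose calls. The class name `PatternSketch` and the sample strings are hypothetical and not taken from the PDF, and how Aspose.PDF itself evaluates the pattern may differ.

```csharp
using System;
using System.Text.RegularExpressions;

class PatternSketch
{
    static void Main()
    {
        string searchText = "Diesem Angebot";

        // Same transformation as in the snippet above: every space becomes \s*
        string pattern = "(?i)(" + searchText.Replace(" ", @"\s*") + ")";
        Console.WriteLine(pattern); // prints (?i)(Diesem\s*Angebot)

        // Hypothetical sample strings, chosen only to illustrate the pattern
        string[] samples = { "Diesem Angebot", "diesem   angebot", "DiesemAngebot" };
        foreach (string sample in samples)
        {
            // IsMatch reports whether the pattern occurs anywhere in the sample
            Console.WriteLine($"{sample} -> {Regex.IsMatch(sample, pattern)}");
        }
    }
}
```

Note that the search text is inserted into the pattern unescaped, so this only behaves as intended when it contains no regex metacharacters.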