Free Support Forum - aspose.com

Issue removing text from a pdf

I’m having trouble removing text from some PDFs. I’m trying to remove text that is drawn at an angle.
The text is removed correctly, but some unrelated elements on the page shift horizontally.

I checked the positions of the text fragments after editing the document but before saving, and they were unchanged. Some elements lose their original position only after the document is saved.
Below is the code used to remove the angled text:

As you can see in the attached images (data censored), the highlighted value shifted horizontally after the slanted text was removed.

        public void MssRemoveWatermarks2(byte[] ssLicense, byte[] ssOriginalFile, int ssangle, out byte[] ssCleanFile, out bool ssResult, out string ssResultMessage)
        {
            ssCleanFile = new byte[] {};
            ssResult = false;
            ssResultMessage = "";
            try
            {
                setPDFLicense(ssLicense);

                using (MemoryStream inputStream = new MemoryStream(ssOriginalFile))
                {
                    // Open document
                    using (Aspose.Pdf.Document document = new Aspose.Pdf.Document(inputStream))
                    {

                        TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();

                        textFragmentAbsorber.TextReplaceOptions.ReplaceAdjustmentAction = TextReplaceOptions.ReplaceAdjustment.None;

                        document.Pages.Accept(textFragmentAbsorber);

                        TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

                        foreach (TextFragment tf in textFragmentCollection)
                        {
                            
                            if (tf.TextState.Rotation >= ssangle || tf.TextState.Rotation <= -ssangle)
                            {
                                
                                tf.Text = "";
                            }
                        }

                        // Return PDF without watermarks
                        using (MemoryStream outputStream = new MemoryStream())
                        {
                            document.Save(outputStream, Aspose.Pdf.SaveFormat.Pdf);
                            ssCleanFile = outputStream.ToArray();
                            ssResult = true;
                        }
                    }
                }
            }
            catch (Exception e)
            {
                ssResultMessage = "error: " + e.Message;
            }
        }

original1.png (13.3 KB)
TextRemoved1.png (4.2 KB)

@BFMarques

Thank you for contacting support.

We are looking into this and will get back to you shortly.

@BFMarques

Would you please share the source and generated PDF files so that we can try to reproduce and investigate the issue in our environment? Please also describe how the MssRemoveWatermarks2 method is called, because the values of ssangle and the other variables are not specified. Before sharing the requested data, please make sure you are using Aspose.PDF for .NET 19.11.

OK. I have attached two files: TestFileOriginal is the original file and TestFileResult is the saved file.
Regarding the variables:
ssLicense - the license file
ssOriginalFile - the original file
ssangle - the threshold angle; text rotated by at least this angle, in either direction, is removed
The others are outputs.
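
For reference, the threshold test in the loop amounts to an absolute-value comparison on the fragment's rotation. The following is a minimal sketch only; RotationFilter and IsSlanted are hypothetical names used for illustration and are not part of Aspose.PDF, and it assumes the rotation is reported in degrees as a double.

```csharp
using System;

public static class RotationFilter
{
    // Hypothetical helper, not part of Aspose.PDF.
    // Mirrors the condition in the loop:
    //   rotation >= threshold || rotation <= -threshold
    public static bool IsSlanted(double rotationDegrees, int thresholdDegrees)
    {
        return Math.Abs(rotationDegrees) >= thresholdDegrees;
    }

    public static void Main()
    {
        // With a threshold of 10, horizontal text is kept and
        // text rotated in either direction is flagged for removal.
        Console.WriteLine(IsSlanted(0.0, 10));   // False
        Console.WriteLine(IsSlanted(45.0, 10));  // True
        Console.WriteLine(IsSlanted(-30.0, 10)); // True
    }
}
```

Note that the comparison is inclusive, so a fragment rotated by exactly the threshold is also removed.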

To simplify, here is the method body with the threshold hard-coded to 10:

            ssCleanFile = new byte[] {};
            ssResult = false;
            ssResultMessage = "";
            try
            {
                setPDFLicense(ssLicense); // function that sets the license file

                using (MemoryStream inputStream = new MemoryStream(ssOriginalFile))
                {
                    // Open document
                    using (Aspose.Pdf.Document document = new Aspose.Pdf.Document(inputStream))
                    {

                        TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();

                        textFragmentAbsorber.TextReplaceOptions.ReplaceAdjustmentAction = TextReplaceOptions.ReplaceAdjustment.None;

                        document.Pages.Accept(textFragmentAbsorber);

                        TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

                        foreach (TextFragment tf in textFragmentCollection)
                        {
                            
                            if (tf.TextState.Rotation >= 10 || tf.TextState.Rotation <= -10)
                            {
                                
                                tf.Text = "";
                            }
                        }

                        // Return PDF without watermarks
                        using (MemoryStream outputStream = new MemoryStream())
                        {
                            document.Save(outputStream, Aspose.Pdf.SaveFormat.Pdf);
                            ssCleanFile = outputStream.ToArray();
                            ssResult = true;
                        }
                    }
                }
            }
            catch (Exception e)
            {
                ssResultMessage = "error: " + e.Message;
            }

TestFileOriginal.pdf (91.0 KB)
TestFileResult.pdf (92.2 KB)

@BFMarques

We are unable to reproduce the issue with the latest version of the API. Please make sure you are using the latest version and then share your feedback with us. TestFileResult_19.11.pdf