Blacken/Anonymize Text in a PDF Document

Dear support team,

How can I blacken/anonymize text in PDF documents with Aspose.PDF?

I want to remove some text from the document so that it cannot be extracted in any way.
I also want to blacken the area where the text was.
If this is possible, could you please provide a code sample?

@christian.schmidt.lf

Thank you for contacting support.

You can redact a specific page region with RedactionAnnotation, which should meet your requirements.

We hope this will be helpful. Please feel free to contact us if you need any further assistance.

@Farhan.Raza

Thank you for your reply.

If I use the sample with Aspose.PDF product version 2015.05.13, it works great.

But when I use the sample with product version 17.10, it does not work correctly.

  1. My rectangle is: new Rectangle(174, 488, 187, 501)
  2. OverlayText = " "
  3. In the attached PDF document, this blackens three characters of the word “Gewerblicher”.

AnonymizingTextNotWorking (20743).pdf (23.5 KB)

Result:

  1. “rbl” is correctly removed
  2. and the area is correctly blackened.
  3. But the text after the blackened characters is moved under the blackened area.

If I select the blackened word and copy and paste it, I get “Gewicher”, but the characters ‘c’ and ‘h’ are under the RedactionAnnotation.

@christian.schmidt.lf

The shared document [AnonymizingTextNotWorking (20743).pdf] does not contain any redacted text, so it appears to be the source PDF document. Please share the code snippet you are using to redact the text, along with the generated PDF file, so that we can investigate.

Before sharing the requested data, please make sure you are using the latest version, Aspose.PDF for .NET 18.4.1, in your environment.

I have updated Aspose.PDF to version 18.4.1, but the problem is not solved.

I use the following code:

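// Namespaces needed by the snippet.
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Annotations;

// Load the source PDF into memory.
var content = File.ReadAllBytes(@"C:\temp\testDoc.pdf");

using (var inputStream = new MemoryStream(content))
using (var document = new Document(inputStream))
{
    Page page = document.Pages[1];

    // Region that should cover the characters "rbl" in "Gewerblicher".
    RedactionAnnotation annot = new RedactionAnnotation(page, new Rectangle(174, 488, 187, 501))
    {
        FillColor = Color.Black,
        Color = Color.Black,
        BorderColor = Color.Black
    };

    page.Annotations.Add(annot);

    // Remove the content under the annotation and fill the region.
    annot.Redact();

    document.Save(@"C:\temp\testDocRedacted.pdf");
}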

The Document I’m using is: AnonymizingTextNotWorking (20743).pdf (23.5 KB)

The result document I get is: testDocRedacted.pdf (20.4 KB)

You can see that three characters (“rbl”) of the word “Gewerblicher” are blackened, but four characters (“erbl”) are removed.

This scenario occurs when you redact only part of a word.

Do you have a solution for this problem, so that only the text under the blackened section is removed?
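
To check what text is still extractable after redaction without relying on manual copy and paste, the output file can be read back programmatically. The following is only a minimal sketch: it assumes the output path used above and uses the TextAbsorber class from the Aspose.Pdf.Text namespace; the class name RedactionCheck is made up for illustration.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class RedactionCheck // hypothetical name, for illustration only
{
    static void Main()
    {
        // Output path taken from the redaction code in this thread.
        using (var document = new Document(@"C:\temp\testDocRedacted.pdf"))
        {
            // TextAbsorber collects the extractable text of the pages it visits.
            var absorber = new TextAbsorber();
            document.Pages[1].Accept(absorber);

            string text = absorber.Text;

            // The full word should no longer be extractable after redaction.
            Console.WriteLine("Full word still present: " + text.Contains("Gewerblicher"));

            // Print the remaining text to see exactly which characters were removed.
            Console.WriteLine(text);
        }
    }
}
```

Comparing the printed text with the source document shows whether more characters were removed than the blackened region covers.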

@christian.schmidt.lf

We have tested with the data you shared and were able to reproduce the problem: “erbl” is removed instead of “rbl”. A ticket with ID PDFNET-44620 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked with this thread so that you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.