Delete all selected text from PDF document using TextFragmentAbsorber Class in Aspose.PDF for .NET

ashmid_a · August 9, 2015, 6:25am

I am using TextFragmentAbsorber to iterate over the text fragments in a PDF document, and I would like to delete selected TextFragment objects. How can I do this? The TextFragmentCollection class does not seem to have a Remove() method.

codewarior · August 10, 2015, 6:38am

Hi Avi,

Thanks for contacting support.

To remove a TextFragment, you can replace the fragment with a blank instance. Once the contents are replaced, you can rearrange the page contents to avoid any formatting issues. Please visit the following links for related information on

In case you encounter any issue, please share your resource files, so that we can test the scenario in our environment.

ashmid_a · August 10, 2015, 7:38pm

Please elaborate and explain what you mean by “replace the fragment with blank instance” (I examined the links, but they don’t have any examples of blank instances).

codewarior · August 11, 2015, 1:50pm

Hi Avi,

To meet your requirement, the text can be replaced with an empty string, and the page contents should then be adjusted automatically, as shown in the following code snippet. However, during my testing, I observed that the page contents are not being rearranged automatically. I have logged this problem as PDFNEWNET-39171 in our issue tracking system. We will look further into the details of this problem and keep you updated on the status of the fix. Please be patient and allow us some time. We are sorry for the inconvenience.

[C#]

using Aspose.Pdf;
using Aspose.Pdf.Text;

// Load the source PDF file
Document doc = new Document("c:/pdftest/42441893(1).pdf");

// Create a TextFragmentAbsorber that searches for the phrase "organisationer"
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("organisationer");
// Request space-width adjustment so the surrounding text is rearranged after replacement
textFragmentAbsorber.TextReplaceOptions.ReplaceAdjustmentAction = TextReplaceOptions.ReplaceAdjustment.AdjustSpaceWidth;

// Run the absorber over all pages of the document
doc.Pages.Accept(textFragmentAbsorber);

// Blank out each matching TextFragment
foreach (TextFragment textFragment in textFragmentAbsorber.TextFragments)
{
    // Set the font of the text fragment being replaced
    textFragment.TextState.Font = FontRepository.FindFont("Arial");
    // Set the font size
    textFragment.TextState.FontSize = 12;
    // Set the text color
    textFragment.TextState.ForegroundColor = Aspose.Pdf.Color.Navy;
    // Replace the matched text with an empty string to remove it
    textFragment.Text = "";
}

// Save the resulting PDF
doc.Save("c:/pdftest/TextRemove.pdf");
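
The original question was about deleting selected fragments while iterating, rather than every match of one phrase. The same approach might be adapted as in the sketch below. It assumes a parameterless TextFragmentAbsorber collects all text fragments on the visited pages. The file names and the shouldDelete rule are hypothetical placeholders, and the layout problem logged as PDFNEWNET-39171 may still affect the result.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

// Hypothetical input file, used for illustration only
Document doc = new Document("input.pdf");

// Assumption: an absorber created without a search phrase collects all text fragments
TextFragmentAbsorber absorber = new TextFragmentAbsorber();
doc.Pages.Accept(absorber);

// Hypothetical selection rule: delete any fragment containing the word "DRAFT"
Func<TextFragment, bool> shouldDelete = fragment => fragment.Text.Contains("DRAFT");

foreach (TextFragment fragment in absorber.TextFragments)
{
    if (shouldDelete(fragment))
    {
        // Set Text on the TextFragment itself (not on its TextSegment objects)
        fragment.Text = "";
    }
}

// Hypothetical output file
doc.Save("output.pdf");
```

Only the predicate needs to change to select different fragments; the fragments stay in the collection, and only their text is emptied.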

ashmid_a · August 12, 2015, 7:21pm

Dear Nayyer,

1] Thank you for this explanation. I now understand my mistake. Previously I was setting the TextSegment.Text field to a blank string, which had no effect at all. I see now that I have to reset the TextFragment.Text field, rather than the TextSegment.Text field.

2] Indeed, as you note, after deleting selected characters, the rest of the text is completely misarranged. Worse, many of the text fields become doubled (e.g. fragments that were “2” are now “22”), and others have disappeared from the page completely. Please do notify us when this has been resolved.

Sincerely,

Avi

codewarior · August 13, 2015, 6:11am

ashmid_a:

2] Indeed, as you note, after deleting selected characters, the rest of the text is completely misarranged. Worse, many of the text fields become doubled (e.g. fragments that were "2" are now "22"), and others have disappeared from the page completely. Please do notify us when this has been resolved.

Hi Avi,

Regarding the scenario described above, could you please share some resource files and a code snippet so that we can look into this matter further?

Please note that during my testing I used one of my sample PDF files and did not notice the content duplication issue.

asad.ali · June 3, 2020, 9:16pm

@ashmid_a

Regarding your initial inquiry, you can now remove all text from a PDF document using TextFragmentAbsorber, which is a faster way to remove text. Please use the linked example with the latest version of the API, and if you need further information, please feel free to let us know.
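
The linked example is not preserved in this thread. As a rough sketch of what it likely covers, recent versions of Aspose.PDF for .NET expose a RemoveAllText method on TextFragmentAbsorber. The file names below are hypothetical, and the exact behavior should be checked against the API version in use.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

// Hypothetical input file, used for illustration only
Document doc = new Document("input.pdf");

// No search phrase is needed because all text is removed
TextFragmentAbsorber absorber = new TextFragmentAbsorber();

// Assumption: RemoveAllText(Document) is available in the installed API version
absorber.RemoveAllText(doc);

// Hypothetical output file
doc.Save("output_no_text.pdf");
```

This removes every piece of text in the document, so it does not by itself answer the original question about deleting only selected fragments.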