Aspose PDF extract text

Dear Support,

Is there a way to remove an array of strings from a PDF within a certain region, based on Rectangle coordinates?

Thanks
Ganesh.sv | 9600114171

@ganesh.sv

Thank you for contacting support.

You can redact any area of a page, as explained in "Redact certain page region with RedactionAnnotation".

We hope this will be helpful. Please feel free to contact us if you need any further assistance.

@Farhan.Raza
Redaction will not work for me.
Redaction removes all the text in a rectangle.
But I have a set of strings that should be searched for within the rectangular area, and each match should then be removed or replaced with some other string.

@ganesh.sv

You can search for text within a region by specifying rectangle coordinates, as in the code snippet below, and then replace or remove the matched text as per your requirements.

// Load the PDF file (dataDir is the path to the folder that contains your documents)
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(dataDir + "Test.pdf");
foreach (Aspose.Pdf.Page page in pdfDocument.Pages)
{
    // Create a TextFragmentAbsorber for the phrase to find
    Aspose.Pdf.Text.TextFragmentAbsorber textFragmentAbsorber = new Aspose.Pdf.Text.TextFragmentAbsorber("SEARCH");
    // Limit the search to the page bounds
    textFragmentAbsorber.TextSearchOptions.LimitToPageBounds = true;
    // Specify the page region to search (lower-left x, lower-left y, upper-right x, upper-right y).
    // This rectangle covers the whole page; substitute the coordinates of your region.
    textFragmentAbsorber.TextSearchOptions.Rectangle = new Aspose.Pdf.Rectangle(0, 0, page.PageInfo.Width, page.PageInfo.Height);
    // Run the search on the current page
    page.Accept(textFragmentAbsorber);
    // Iterate through the matched TextFragments
    foreach (Aspose.Pdf.Text.TextFragment tf in textFragmentAbsorber.TextFragments)
    {
        // Replace text
        tf.Text = "REPLACE";

        // Remove text
        // tf.Text = "";
    }
}
// Save the updated PDF file after the text replacement
pdfDocument.Save(dataDir + "TextReplaced_18.10.pdf");
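
Since the question involves a set of strings rather than a single phrase, the same approach can be repeated for each string. The following is only a minimal sketch under stated assumptions: RegionTextReplacer, ReplaceInRegion, the file names, the sample strings and the rectangle coordinates are hypothetical and chosen for illustration. The Aspose.PDF calls are the same ones used in the snippet above.

```csharp
using System.Collections.Generic;
using Aspose.Pdf;
using Aspose.Pdf.Text;

public static class RegionTextReplacer
{
    // Hypothetical helper (not part of Aspose.PDF): for every key in
    // `replacements`, finds matches inside `region` on each page and sets
    // their text to the mapped value. Map a key to "" to remove the string.
    public static void ReplaceInRegion(
        string inputPath,
        string outputPath,
        Rectangle region,
        IDictionary<string, string> replacements)
    {
        Document pdfDocument = new Document(inputPath);

        foreach (Page page in pdfDocument.Pages)
        {
            foreach (KeyValuePair<string, string> pair in replacements)
            {
                // One absorber per search string, configured as in the snippet above
                TextFragmentAbsorber absorber = new TextFragmentAbsorber(pair.Key);
                absorber.TextSearchOptions.LimitToPageBounds = true;
                absorber.TextSearchOptions.Rectangle = region;

                // Run the search on the current page only
                page.Accept(absorber);

                foreach (TextFragment fragment in absorber.TextFragments)
                {
                    fragment.Text = pair.Value;
                }
            }
        }

        pdfDocument.Save(outputPath);
    }

    public static void Main()
    {
        // Assumed region: lower-left (50, 600) to upper-right (300, 750)
        Rectangle region = new Rectangle(50, 600, 300, 750);

        // Assumed sample strings
        Dictionary<string, string> replacements = new Dictionary<string, string>
        {
            { "Draft", "Final" },    // replace
            { "Confidential", "" }   // remove
        };

        ReplaceInRegion("Test.pdf", "TextReplaced.pdf", region, replacements);
    }
}
```

Please note that the strings are processed one after another, so a replacement value that contains another search string may be matched again by a later search. Choose the replacement values, or the order of the entries, with that in mind.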

We hope this will be helpful. Please feel free to contact us if you need any further assistance.