Adding a PDF file for redaction in C#

Good afternoon,

I am working on redacting a PDF file using Aspose.PDF in C#. How would you go about it if the file is uploaded to a web API by any user, rather than read from a fixed path as in the documentation examples, and then downloaded back as a PDF once the redaction is done?

I am using FileContentResult to return the file, which has worked with other file types.

Thank you, and any input is welcome.

@usz.a

Aspose.PDF for .NET is an on-premise API. It does not provide a feature to download a PDF document that has been uploaded somewhere, nor does it provide any functionality to upload a file. It does, however, let you read a PDF document from a Stream and save the output to a Stream. Both streams can be managed with standard C# code for uploading and downloading the file content. Please feel free to add more in case we missed something or misunderstood your question.
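As a rough illustration of the stream-in, stream-out approach described above, here is a minimal sketch. The class name PdfStreamSketch and the method name ProcessPdf are hypothetical; only the Document(Stream) constructor and Document.Save(Stream) come from Aspose.PDF. Where the input stream comes from (an upload, a download, a file) is left to the caller.

```csharp
using System.IO;
using Aspose.Pdf;

// Hypothetical helper, for illustration only.
public static class PdfStreamSketch
{
    // Reads a PDF from any readable stream and returns the saved result as bytes.
    public static byte[] ProcessPdf(Stream input)
    {
        // Load the document directly from the stream instead of a file path.
        using var doc = new Document(input);

        // ... redaction, flattening, or other changes would go here ...

        // Save into a separate in-memory stream so the output can be returned to the caller.
        using var output = new MemoryStream();
        doc.Save(output);
        return output.ToArray();
    }
}
```

The returned byte array could then be handed to whatever the web layer uses to send a file back to the client.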

Thank you for your response @asad.ali

My question, then, is how to use Aspose.PDF to read the PDF as a stream. I have been able to figure out how to save it and return a downloadable file by testing with other methods. Whenever I try to redact my file, it does not work.

@usz.a

Please check the below documentation topic in order to read a PDF file using Stream:

If you are facing an issue while redacting the content, please share your sample PDF file along with a sample code snippet so that we can test the scenario in our environment and address it accordingly.

Afternoon @asad.ali

I used the link you shared, but it doesn't do exactly what I want.

To put it simply, I would like the PDF file to be opened as a stream and then returned as a stream, so that it becomes downloadable after the redaction has been performed on it.

I am currently using FileContentResult and hope to keep using it, because it is supported by the Swagger UI I am working with. I have attached my code snippet and a sample file to work with.

Ideally, I would like the method to take a second argument that lets me pass in the file.

Thank you for your help.

    public FileContentResult FlattenPdf(string flattenResult = "flattened.pdf")
    {
        string witnessFile =" Witness Statement Form DRAFT v1.pdf";
        flattenResult = Path.GetExtension(flattenResult).Equals(".PDF", StringComparison.OrdinalIgnoreCase) ? flattenResult : string.Format("{0}.pdf", flattenResult);
        var webCleint = new WebClient();
        var strWebResource = _resourcePath + witnessFile;
        //Helps to create a result
        
        using var mem = new MemoryStream();

        webCleint.OpenRead(strWebResource)?.CopyTo(mem);

        Form pdfForm = new Form();

        pdfForm.BindPdf(mem);

        pdfForm.FlattenAllFields();

        pdfForm.Save(flattenResult);

        var result = mem.ToArray();
        return new FileContentResult(result, "application/pdf") { FileDownloadName = flattenResult };

    }

Witness Statement Form DRAFT v1.pdf (103.9 KB)

@usz.a

By redaction, do you mean flattening the PDF document (as your code explains it)?

Please explain a bit more: what type of second argument do you expect, and in which method? For example, Document.Save("FilePath", "SecondArgument??"). Furthermore, if you are actually redacting content inside a PDF using RedactionAnnotations, please share the corresponding code for our reference as well.

I should have clarified: I am working on both redacting and flattening the PDF.
For the second argument, I am looking for something like public FileContentResult RedactPdf(string inputPath, string redactResult = "redactedPage.pdf"), where the first argument is the input file and the second is the name of the resulting file.

Here is the code snippet for the redaction:

    public FileContentResult RedactPdf(string inputPath, string redactResult = "redactedPage.pdf")
    {
       
        redactResult = Path.GetExtension(redactResult).Equals(".PDF", StringComparison.OrdinalIgnoreCase) ? redactResult : string.Format("{0}.pdf", redactResult);
        using var mem = new MemoryStream();

        Aspose.Pdf.Document doc = new Aspose.Pdf.Document(inputPath);

        RedactionAnnotation annot = new RedactionAnnotation(doc.Pages[1], new Aspose.Pdf.Rectangle(200, 500, 300, 600));
        annot.FillColor = Aspose.Pdf.Color.DarkBlue;
        annot.BorderColor = Aspose.Pdf.Color.SpringGreen;
        annot.Color = Aspose.Pdf.Color.DarkGreen;

        annot.OverlayText = "REDACTED";

        annot.TextAlignment = Aspose.Pdf.HorizontalAlignment.Center;
        annot.Repeat = true;

        doc.Pages[1].Annotations.Add(annot);

        annot.Redact();
           
        doc.Save(redactResult);

        var result = mem.ToArray();
        return new FileContentResult(result, "application/pdf") { FileDownloadName = redactResult };

    }

Thank you and please let me know if you have any more questions.

@usz.a

Have you checked the article on saving a PDF document in web applications?

Instead of sending data to a FileContentResult, which only works in Swagger UI, you can use the Document.Save() method, which sends the file to the response for download and also takes a file name as an argument.

I wasn't aware you had that feature; this is my first time using Aspose. I will check it out.

Thank you.

@usz.a

Please take your time to check this method and feel free to let us know in case you need more information.

Hello @asad.ali

I am still working on the flatten PDF method and am getting this error: Aspose.Pdf.InvalidPdfFileFormatException: 'Incorrect file header'

I have seen examples of people using BindPdf on a MemoryStream, and I have done exactly the same thing.

Here’s my code snippet:

public FileContentResult FlattenPdf(IFormFile uploadedFile, string flattenResult = "flattened.pdf")
{

        flattenResult = Path.GetExtension(flattenResult).Equals(".PDF", StringComparison.OrdinalIgnoreCase) ? flattenResult : string.Format("{0}.pdf", flattenResult);

        using var mem = new MemoryStream();

        var openFilePath = Path.Combine(@"c:\Users\john.doe\Downloads", uploadedFile.FileName);

        using var stream = new FileStream(openFilePath, FileMode.Open);
        uploadedFile.CopyToAsync(stream);
        stream.CopyTo(mem);

        Form pdfForm = new Form();

        pdfForm.BindPdf(mem);

        pdfForm.FlattenAllFields();

        pdfForm.Save(mem);

        var result = mem.ToArray();
        return new FileContentResult(result, "application/pdf") { FileDownloadName = flattenResult };

    }

I am still using the same PDF file as before, in case you need to test anything.

Please let me know if there's anything wrong with the code.

Thank you!

@usz.a

We tested the scenario in our environment using the code snippet below in a console application:

using var mem = new MemoryStream();
using var stream = new FileStream(dataDir + "Witness Statement Form DRAFT v1.pdf", FileMode.Open);
stream.CopyTo(mem);
Facades.Form pdfForm = new Facades.Form();
pdfForm.BindPdf(mem);
pdfForm.FlattenAllFields();
pdfForm.Save(mem);
var result = mem.ToArray();

We did not notice any issues. We used version 22.9 of the API. Could you please provide a sample application in .zip format that we can use to reproduce the issue?
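For comparison with the web API version posted earlier, here is a minimal sketch that binds the uploaded content itself rather than a file on disk. It rests on assumptions not confirmed in this thread: that the 'Incorrect file header' error may come from the MemoryStream being empty when BindPdf is called (the earlier snippet never awaits CopyToAsync and copies from a FileStream rather than from the upload), and that saving to a separate output stream is preferable to reusing the input stream. The controller name PdfSketchController is hypothetical, and routing attributes are omitted.

```csharp
using System.IO;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;

// Hypothetical controller, for illustration only; routing attributes are omitted.
public class PdfSketchController : ControllerBase
{
    public FileContentResult FlattenPdf(IFormFile uploadedFile, string flattenResult = "flattened.pdf")
    {
        // Copy the uploaded bytes into memory; no file on disk is involved.
        using var input = new MemoryStream();
        using (var upload = uploadedFile.OpenReadStream())
        {
            upload.CopyTo(input);
        }
        input.Position = 0; // rewind so the PDF header is read from the start

        // Fully qualified to avoid confusion with other types named Form.
        var pdfForm = new Aspose.Pdf.Facades.Form();
        pdfForm.BindPdf(input);
        pdfForm.FlattenAllFields();

        // Assumption: write the result to a fresh stream instead of the input stream.
        using var output = new MemoryStream();
        pdfForm.Save(output);

        return new FileContentResult(output.ToArray(), "application/pdf")
        {
            FileDownloadName = flattenResult
        };
    }
}
```

If the error persists with a stream that is known to contain the PDF bytes, a sample application as requested above would still be the way to reproduce it.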


Good afternoon @asad.ali

I am working on a redaction method and have a couple of questions. Is it possible to redact a file with multiple pages? For example, I have two pages and want to redact a certain area on both of them, and on more pages if needed. So far I have been able to handle single-page redaction from the API requests, but not multiple pages.

Thank you and please let me know if you need more clarifications.

@usz.a

Can you please share a sample file and code snippet to further explain why you are unable to redact a PDF with multiple pages?

In the above line of code, you can simply specify the page (it can be any page in the PDF document) to add the redaction annotation to.

@asad.ali

The code works fine, but my question is whether it is possible to redact two pages at once. For instance, I have a PDF file with two pages. To redact the first page I am using the code snippet you have above, but how would you approach the second page without having to pass in the file again and specify that the page to redact is page 2?

@usz.a

You need to pass the file only once. After that, you can iterate through the pages and add a redaction annotation to each one. For example:

var doc = new Document("input.pdf");
foreach(var page in doc.Pages)
{
    // Add the redaction annotation to this page here
}
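To make the loop above concrete, here is a minimal sketch that reuses the RedactionAnnotation calls from the single-page snippet earlier in this thread. The class and method names are hypothetical, and the rectangle coordinates are simply copied from that earlier snippet; it assumes the same area should be redacted on every page.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Annotations;

// Hypothetical helper, for illustration only.
public static class RedactionSketch
{
    // Applies the same redaction rectangle to every page of an already loaded document.
    public static void RedactSameAreaOnEveryPage(Document doc)
    {
        foreach (Page page in doc.Pages)
        {
            // Coordinates taken from the single-page example; adjust per document.
            var annot = new RedactionAnnotation(page, new Aspose.Pdf.Rectangle(200, 500, 300, 600));
            annot.FillColor = Aspose.Pdf.Color.DarkBlue;
            annot.OverlayText = "REDACTED";
            annot.TextAlignment = Aspose.Pdf.HorizontalAlignment.Center;

            // Add the annotation to the current page, then apply the redaction.
            page.Annotations.Add(annot);
            annot.Redact();
        }
    }
}
```

The document is loaded once, passed to this method, and then saved once after the loop finishes.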

Hello @asad.ali, thank you for your continuous help.

I am working on a flatten PDF method in C#, but I am having an issue: the checkboxes in the PDF file are still interactive, and you can still fill in text boxes with the Adobe Fill & Sign function. Is there something we can do about that?

    public flattenPdfResponse FlattenPdf(flattenPdfRequest request)
    {

        byte[] byteArray = Convert.FromBase64String(request.encodedFileString);

        MemoryStream mem = new MemoryStream();
        mem.Write(byteArray, 0, byteArray.Length);

        Aspose.Pdf.Facades.Form pdfForm = new Aspose.Pdf.Facades.Form();

        pdfForm.BindPdf(mem);
        pdfForm.FlattenAllFields();

        pdfForm.Save(mem);

        return new flattenPdfResponse(mem.ToArray(), string.Empty, HttpContext?.User?.Identity?.Name ?? "");

    }

Here is my code snippet. request.encodedFileString is just a Base64-encoded PDF file.

I would appreciate it if you can look into this.

Thank you.

@usz.a

Would you please share your sample PDF with us as well? We will test the scenario in our environment and address it accordingly.

Witness Statement - Scanned Sample.pdf (95.8 KB)
@asad.ali Here's a sample of the PDF file, as requested.

@usz.a

Are you sure that this PDF document has form fields in it? We checked it, and it appears to contain only an image.