PDF Redaction is not working properly with Aspose.PDF 18.10.0
Problem Statement:
- We are facing problems while redacting content from a PDF file.
- For your reference, I have created a console application that reproduces the issue using Aspose.PDF 18.10 and attached it here.
Steps to perform:
- The console application does not contain the license file (Aspose.Total.lic), so you will need to set the license yourself.
- A sample PDF (Pdf\PdfConsoleApp.pdf) is included in the console application.
- Build the solution, since the attachment does not contain the bin\Debug folder.
- Run the console application.
- It throws an exception from page 21. Here are the exception (System.ArgumentOutOfRangeException) message and the stack trace:
a) Exception message: “Index was out of range. Must be non-negative and less than the size of the collection”
b) Stack trace:
   at System.ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument argument, ExceptionResource resource)
   at Aspose.Pdf.Text.TableAbsorber.\u0002(Page \u0002)
   at Aspose.Pdf.Text.TableAbsorber.\u0002(Page \u0002)
   at \u000f ??.\u0002(Page \u0002, Rectangle \u0003, Rectangle& \u0005)
   at \u000f ??.\u0002(TextFragment \u0002, Double \u0003)
   at \u000f ??..ctor(TextFragment \u0002, Double \u0003)
   at Aspose.Pdf.Text.TextFragment.set_Text(String value)
   at RedactionAsposePDF.TextRedaction.RedactKeywords(Stream fileStream, List`1 keywords, String fileName) in E:\SourceCode\Content Analysis\Proofs of Concept\Console App for PDF Redaction\RedactionAsposePDF\TextRedaction.cs:line 40
   at RedactionAsposePDF.Program.PdfTextRedaction() in E:\SourceCode\Content Analysis\Proofs of Concept\Console App for PDF Redaction\RedactionAsposePDF\Program.cs:line 25
   at RedactionAsposePDF.Program.Main(String[] args) in E:\SourceCode\Content Analysis\Proofs of Concept\Console App for PDF Redaction\RedactionAsposePDF\Program.cs:line 10
- On further investigation, I found that TextFragmentAbsorber.TextFragments.Count is non-zero on pages where the word does not exist (TextFragmentAbsorber.Text is empty on those pages). On pages where the word does exist, the count is higher than the actual number of occurrences. When the code then tries to replace the word in one of these extra fragments, the word is not found, hence the exception.
RedactionAsposePDF.zip (104.7 KB)
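For anyone who wants to check the fragment counts without opening the attachment, below is a rough sketch of a per-page search-and-replace loop of the kind described above. It is not the code from TextRedaction.cs. The class name RedactionSketch, the method RedactKeywordsSketch, the outputPath parameter and the "[REDACTED]" replacement text are all hypothetical and only for illustration. It logs the fragment count per page and catches the exception so that the failing pages can be identified.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Text;

public static class RedactionSketch
{
    // Hypothetical diagnostic helper; the attached project may be structured differently.
    public static void RedactKeywordsSketch(Stream fileStream, List<string> keywords, string outputPath)
    {
        var document = new Document(fileStream);

        foreach (Page page in document.Pages)
        {
            foreach (string keyword in keywords)
            {
                // Search a single page for the keyword.
                var absorber = new TextFragmentAbsorber(keyword);
                page.Accept(absorber);

                // Log the count so it can be compared with the real number of occurrences.
                Console.WriteLine(
                    $"Page {page.Number}, keyword '{keyword}': {absorber.TextFragments.Count} fragment(s)");

                foreach (TextFragment fragment in absorber.TextFragments)
                {
                    try
                    {
                        // Setting Text is the call that appears in the stack trace above.
                        fragment.Text = "[REDACTED]"; // hypothetical replacement text
                    }
                    catch (ArgumentOutOfRangeException ex)
                    {
                        // Record the failing page instead of aborting the whole run.
                        Console.WriteLine($"Replace failed on page {page.Number}: {ex.Message}");
                    }
                }
            }
        }

        document.Save(outputPath);
    }
}
```

Catching the exception here is only meant to help narrow down which pages are affected. It does not fix the unexpected fragment counts.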