We are seeing cases where we call PdfJavaScriptStripper.Strip, it returns false, but the PDF content has been visibly altered. Some text is getting removed from the PDF that existed before calling Strip. It seems like returning false should correspond to the input stream being unmodified.
- Observed behavior: PdfJavaScriptStripper.strip() returns false, but the output PDF is missing text that was present in the source.
- Expectation: A false return value should indicate that the input stream was left unchanged.
What we need to investigate
- Aspose.PDF for Java version – please confirm the exact version (e.g., 23.12, 24.1, etc.).
- Minimal reproducible sample – a small PDF (≤ 2 pages) that shows the issue, together with the exact code you use to call strip().
- How the streams are supplied – are you passing file paths, InputStream/OutputStream, or memory streams?
Suggested first step
- Upgrade to the latest Aspose.PDF for Java release (if you are not already on it) and re‑run the same test. In recent builds the strip() method has been tightened to avoid modifying the source when it returns false (see internal change log for version 23.12).
If the problem persists
Provide the items listed above so we can reproduce the issue internally and log a defect. Once we have a reproducible case we can confirm whether this is a bug in the current version and work on a fix.
Reference: PdfJavaScriptStripper is used to extract or remove JavaScript actions from PDFs (forum discussion on extracting JavaScript).
Would you kindly share your sample code snippet along with the sample PDF document for our reference? We will test the scenario in our environment and address it accordingly.
I can share the code snippet, but the only PDF I have with a repro has PII in it.
However, since you asked for the code, I realized it might not be the Strip call causing the issue. We also call Flatten. I just confirmed that the issue is caused by Flatten, not Strip.
Is the Flatten call correct?
Here’s the code:
public static MemoryStream StripJavascriptFromPDF(Stream inputStream)
{
    using (var outputStream = new MemoryStream())
    using (var flattenedStream = new MemoryStream())
    {
        var pdfDoc = new Document(inputStream);
        var pdfJsStripper = new PdfJavaScriptStripper();
        // Flattens form fields and annotations. If these are left intact, JS Stripping can fail with
        // an inscrutable error like:
        // "System.InvalidOperationException: Operation is not valid due to the current state of the object."
        // Suggestion from: https://forum.aspose.com/t/strip-actions-from-pdf/229163
        // If Flatten fails, log the error but continue
        try
        {
            pdfDoc.Flatten();
            pdfDoc.Save(flattenedStream);
        }
        catch (Exception ex)
        {
            CPRLogger.LogException("PDF Flatten failed, continuing to strip JS", ex);
            inputStream.Rewind();
            pdfJsStripper.Strip(inputStream, outputStream);
            return outputStream;
        }
        pdfJsStripper.Strip(flattenedStream, outputStream);
        return outputStream;
    }
}
Yes, the Document.Flatten() call may be the cause of this behavior: it disables the form features in the document, and if JavaScript is attached to an input field, that field can be disabled as well. Please try commenting out this call and see whether it resolves the issue.
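For reference, a variant of the method with the Flatten step left out might look like the sketch below. It is only an illustration under some assumptions: the class name PdfSanitizer, the method name StripWithoutFlatten, and the out parameter are hypothetical, and the stripping step is passed in as a delegate so that the snippet compiles without the Aspose assembly. In the code from this thread, the delegate would be new PdfJavaScriptStripper().Strip. It also assumes the strip call leaves the output stream open.

```csharp
using System;
using System.IO;

public static class PdfSanitizer
{
    // Hypothetical helper: runs the supplied strip step directly on the input,
    // with no Flatten beforehand, and reports the step's return value.
    public static MemoryStream StripWithoutFlatten(
        Stream inputStream,
        Func<Stream, Stream, bool> strip,
        out bool stripped)
    {
        // Start from the beginning of the input when the stream allows it.
        if (inputStream.CanSeek)
        {
            inputStream.Position = 0;
        }

        // Deliberately not wrapped in "using": the caller receives this stream
        // and is responsible for disposing it.
        var outputStream = new MemoryStream();

        stripped = strip(inputStream, outputStream);

        // Rewind so the caller can read the result from the start.
        // CanSeek is false if the strip step closed the stream.
        if (outputStream.CanSeek)
        {
            outputStream.Position = 0;
        }

        return outputStream;
    }
}
```

A call would then read: var result = PdfSanitizer.StripWithoutFlatten(input, new PdfJavaScriptStripper().Strip, out bool stripped); Separately, note that the original method returns outputStream from inside a using block, so the stream is already disposed when the caller receives it; the sketch avoids that by handing ownership to the caller.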