PdfJavaScriptStripper.Strip modifies the PDF content even if it returns false

mloholt · December 10, 2025, 12:33am

We are seeing cases where we call PdfJavaScriptStripper.Strip, it returns false, but the PDF content has been visibly altered. Some text is getting removed from the PDF that existed before calling Strip. It seems like returning false should correspond to the input stream being unmodified.
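
To test the "false means unmodified" expectation directly, one option is to fingerprint a stream before and after the call. The following is a minimal sketch using only standard .NET APIs. The `StreamFingerprint` class and `Sha256Hex` method are hypothetical helper names, not part of Aspose.PDF, and the helper assumes a seekable stream.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class StreamFingerprint
{
    // Hypothetical helper: returns a hex SHA-256 digest of the entire stream
    // and restores the caller's position afterwards.
    public static string Sha256Hex(Stream stream)
    {
        if (!stream.CanSeek)
            throw new ArgumentException("Stream must be seekable.", nameof(stream));

        long originalPosition = stream.Position;
        stream.Position = 0; // hash from the first byte, not the current position
        try
        {
            using (var sha = SHA256.Create())
            {
                byte[] digest = sha.ComputeHash(stream);
                return BitConverter.ToString(digest).Replace("-", "");
            }
        }
        finally
        {
            stream.Position = originalPosition;
        }
    }
}
```

Calling `Sha256Hex(inputStream)` before and after `Strip` and comparing the two strings shows whether the input bytes changed. Comparing the input against the output in the same way is stricter than a visual comparison, because a PDF library may re-serialize a file without changing how it renders.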

Professionalize.Discourse · December 10, 2025, 12:34am

@mloholt

Observed behavior: PdfJavaScriptStripper.Strip() returns false, but the output PDF is missing text that was present in the source.
Expectation: A false return value should indicate that the input stream was left unchanged.

What we need to investigate

Aspose.PDF for Java version – please confirm the exact version (e.g., 23.12, 24.1, etc.).
Minimal reproducible sample – a small PDF (≤ 2 pages) that shows the issue, together with the exact code you use to call strip().
How the streams are supplied – are you passing file paths, InputStream/OutputStream, or memory streams?

Suggested first step

Upgrade to the latest Aspose.PDF for Java release (if you are not already on it) and re‑run the same test. In recent builds the strip() method has been tightened to avoid modifying the source when it returns false (see internal change log for version 23.12).

If the problem persists

Provide the items listed above so we can reproduce the issue internally and log a defect. Once we have a reproducible case we can confirm whether this is a bug in the current version and work on a fix.

Reference: PdfJavaScriptStripper is used to extract or remove JavaScript actions from PDFs (forum discussion on extracting JavaScript).

mloholt · December 10, 2025, 1:16am

It continues to reproduce with the latest version.

asad.ali · December 10, 2025, 4:55pm

@mloholt

Would you kindly share your sample code snippet along with the sample PDF document for our reference? We will test the scenario in our environment and address it accordingly.

mloholt · December 10, 2025, 7:23pm

I can share the code snippet, but the only PDF I have with a repro has PII in it.

However, since you asked for the code, I realized it might not be the Strip call causing the issue. We also call Flatten. I just confirmed that the issue is caused by Flatten, not Strip.

Is the Flatten call correct?

Here’s the code:

public static MemoryStream StripJavascriptFromPDF(Stream inputStream)
{
    using (var outputStream = new MemoryStream())
    using (var flattenedStream = new MemoryStream())
    {
        var pdfDoc = new Document(inputStream);
        var pdfJsStripper = new PdfJavaScriptStripper();

        // Flattens form fields and annotations. If these are left intact, JS Stripping can fail with
        // an inscrutable error like:
        // "System.InvalidOperationException: Operation is not valid due to the current state of the object."
        // Suggestion from: https://forum.aspose.com/t/strip-actions-from-pdf/229163
        // If Flatten fails, log the error but continue
        try
        {
            pdfDoc.Flatten();
            pdfDoc.Save(flattenedStream);
        }
        catch (Exception ex)
        {
            CPRLogger.LogException("PDF Flatten failed, continuing to strip JS", ex);
            inputStream.Rewind();
            pdfJsStripper.Strip(inputStream, outputStream);
            return outputStream;
        }
        pdfJsStripper.Strip(flattenedStream, outputStream);
        return outputStream;
    }
}

asad.ali · December 11, 2025, 4:28am

@mloholt

Yes, the Document.Flatten() method may be the reason for this behavior, because it disables the form features in the document and, if JavaScript is attached to an input field, it can disable that as well. Please try commenting out this call and see whether it resolves the issue.
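
For comparing the two cases side by side, a rough sketch of the method from the post above with the Flatten step made optional might look like the following. The `PdfSanitizer` class name and the `flatten` parameter are hypothetical, introduced only for illustration. The `Document(Stream)`, `Flatten()`, `Save(Stream)` and `Strip(Stream, Stream)` calls are the ones already used in this thread, and the `Aspose.Pdf` namespace for `PdfJavaScriptStripper` is an assumption. The sketch also assumes a seekable input stream and leaves out the Flatten error handling. It returns the output stream without wrapping it in `using`, because a `using` block disposes the stream as the method returns.

```csharp
using System.IO;
using Aspose.Pdf; // assumed namespace for Document and PdfJavaScriptStripper

public static class PdfSanitizer
{
    // flatten: hypothetical switch so the same input can be processed
    // with and without Document.Flatten() and the outputs compared.
    public static MemoryStream StripJavascriptFromPdf(Stream inputStream, bool flatten)
    {
        // Deliberately not in a using block: the caller owns and disposes it.
        var outputStream = new MemoryStream();
        var stripper = new PdfJavaScriptStripper();

        if (!flatten)
        {
            inputStream.Position = 0; // assumes a seekable stream
            stripper.Strip(inputStream, outputStream);
            outputStream.Position = 0;
            return outputStream;
        }

        using (var flattenedStream = new MemoryStream())
        {
            var pdfDoc = new Document(inputStream);
            pdfDoc.Flatten();
            pdfDoc.Save(flattenedStream);

            // Save leaves the position at the end, so rewind before reading.
            flattenedStream.Position = 0;
            stripper.Strip(flattenedStream, outputStream);
        }

        outputStream.Position = 0;
        return outputStream;
    }
}
```

Running the same PDF through both `flatten: true` and `flatten: false` and opening the two results should show whether the missing text tracks the Flatten step.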