Using the code below on the attached input PDF, Aspose ends up corrupting the PDF. We would like to rely on this Optimize logic in production, but it does not seem viable (unless I’m missing something). It also takes over 90 seconds to process, only to fail.
I have narrowed it down to “CompressObjects = true”. If I set that to false, it works, but the file size is larger.
// Requires: using System.Diagnostics; using Aspose.Pdf.Facades;
var sw = Stopwatch.StartNew();
using (var document = new Aspose.Pdf.Document(pdfPath))
{
    var optimizeOptions = new Aspose.Pdf.Optimization.OptimizationOptions()
    {
        AllowReusePageContent = false,
        CompressObjects = true, // setting this to false avoids the corruption
        LinkDuplicateStreams = false,
        RemoveUnusedObjects = true,
        RemoveUnusedStreams = true,
    };
    document.OptimizeResources(optimizeOptions);
    document.Save(tempOutputPath);
}
// Check that the saved output is still recognized as a PDF
using (var info = new PdfFileInfo(tempOutputPath))
{
    if (!info.IsPdfFile)
    {
        Console.WriteLine($"Error after {sw.Elapsed.TotalSeconds:N2} seconds");
        return;
    }
}
// Keep the optimized file only if it is actually smaller
var inputFileSize = new System.IO.FileInfo(pdfPath).Length;
var optimizedFileSize = new System.IO.FileInfo(tempOutputPath).Length;
if (optimizedFileSize >= inputFileSize)
{
    Console.WriteLine($"File size larger, reverting after {sw.Elapsed.TotalSeconds:N2} seconds...");
    return;
}
Console.WriteLine($"Success after {sw.Elapsed.TotalSeconds:N2} seconds!");
FakePassportFormFilled.pdf (2.9 MB)
@markmanley.adlib
Summary
Calling OptimizeResources with CompressObjects = true on the provided PDF can produce a corrupted file and takes a long time (≈ 90 s). The PDF is encrypted and contains XFA‑based form data, which makes the full‑object compression path fragile and CPU‑intensive.
Why it happens
- Encryption & XFA: When a PDF is encrypted, Aspose.PDF must decrypt the streams before any compression can be applied. For PDFs that also contain XFA forms, the decryption step can leave some objects in a state that the CompressObjects routine cannot rewrite, leading to corruption.
- Complex object graph: The document has many reusable page contents, duplicate streams and form-field objects. Enabling CompressObjects forces a re-encoding of every stream, which is both memory- and CPU-heavy. Even on a relatively small (≈ 3 MB) file this can exceed internal limits and cause the observed issues.
- Version-specific behavior: Certain older releases of Aspose.PDF for .NET are known to be less tolerant of the combination “encrypted + XFA + CompressObjects”. Newer builds contain improvements, but the exact behavior depends on the version you are using.
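The combination named above (encrypted plus XFA) could in principle be checked before optimizing, so that CompressObjects is only enabled when neither risk factor applies. The following is a minimal sketch under that assumption. The class and method names are hypothetical, the `Document.IsEncrypted` and `Form.HasXfa` properties are assumed to be available in your Aspose.PDF version, and the link between these traits and the corruption is the unverified hypothesis from this reply, not a confirmed cause.

```csharp
using Aspose.Pdf;

// Hypothetical helper, not part of Aspose.PDF.
public static class OptimizePrecheck
{
    // Returns false when the document shows both traits that the explanation
    // above associates with the failure (encryption and XFA form data).
    // This is a heuristic: a "true" result does not guarantee a valid output.
    public static bool IsCompressObjectsLikelySafe(Document document)
    {
        bool encrypted = document.IsEncrypted;                       // assumed property
        bool hasXfa = document.Form != null && document.Form.HasXfa; // assumed property
        return !(encrypted && hasXfa);
    }
}
```

A caller would set `CompressObjects = OptimizePrecheck.IsCompressObjectsLikelySafe(document)` when building the options. Because the root cause is not confirmed, the output should still be validated after saving.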
What to do next
- Check and, if possible, upgrade the Aspose.PDF library
// NuGet: Install-Package Aspose.Pdf
// Use the most recent stable version (e.g., 23.12 or later)
- Decrypt the PDF before optimization (if it is password‑protected)
// Open the protected file by passing the password to the Document constructor
var doc = new Aspose.Pdf.Document(pdfPath, "yourPassword");
- Apply a safer set of optimization options
var opt = new Aspose.Pdf.Optimization.OptimizationOptions
{
    AllowReusePageContent = true, // reuse already compressed content
    CompressObjects = false, // avoid full-object recompression for encrypted/XFA PDFs
    LinkDuplicateStreams = true, // merge identical streams
    RemoveUnusedObjects = true,
    RemoveUnusedStreams = true,
    // Unverified: confirm that this property exists in your version before relying on it
    CompressionLevel = Aspose.Pdf.Compression.CompressionLevel.Best
};
doc.OptimizeResources(opt);
doc.Save(outputPath, Aspose.Pdf.SaveFormat.Pdf);
- If you still need the extra size reduction that CompressObjects provides, try the PdfDocumentOptimizer class (unverified: confirm that this class and its members exist in your Aspose.PDF version)
var optimizer = new Aspose.Pdf.Optimization.PdfDocumentOptimizer
{
    Compress = true, // enables Flate compression
    RemoveUnusedObjects = true,
    RemoveUnusedStreams = true
};
optimizer.Optimize(doc);
doc.Save(outputPath, Aspose.Pdf.SaveFormat.Pdf);
- Consider flattening the XFA form before optimization (converts the dynamic XFA content to static PDF content)
doc.Form.Flatten(); // makes the form static
doc.OptimizeResources(opt);
doc.Save(outputPath, Aspose.Pdf.SaveFormat.Pdf);
- Validate the resulting file (e.g., using PdfFileInfo or opening it in a viewer) to ensure it is not corrupted and that the size reduction meets your expectations.
- If the problem persists after trying the steps above, capture any exception details and a small excerpt of the corrupted output, then raise an internal follow-up request with the Aspose.PDF team, attaching the PDF and the code snippet. This will allow the product team to investigate the specific scenario.
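The validation step above can be combined with a fallback, so that a silently corrupted output is detected and the optimization is retried without CompressObjects. The following is a minimal sketch, not an Aspose feature. `PdfSanity`, `LooksLikePdf` and `OptimizeWithFallback` are hypothetical names, and the structural check relies only on the fact that a PDF carries a `%PDF-` header and an `%%EOF` end marker, so it catches gross damage only.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; nothing here is part of Aspose.PDF.
public static class PdfSanity
{
    private const int Window = 1024; // bytes inspected at each end of the file

    // Cheap structural check: "%PDF-" near the start and "%%EOF" near the end.
    // Passing does not prove the file is intact; failing means it is clearly broken.
    public static bool LooksLikePdf(string path)
    {
        using (var fs = File.OpenRead(path))
        {
            if (fs.Length < 8) return false;
            int n = (int)Math.Min(Window, fs.Length);
            string head = ReadAscii(fs, 0, n);
            string tail = ReadAscii(fs, fs.Length - n, n);
            return head.Contains("%PDF-") && tail.Contains("%%EOF");
        }
    }

    // Reads up to `count` bytes starting at `offset` and decodes them as ASCII.
    private static string ReadAscii(Stream s, long offset, int count)
    {
        var buffer = new byte[count];
        s.Seek(offset, SeekOrigin.Begin);
        int read = 0;
        while (read < count)
        {
            int got = s.Read(buffer, read, count - read);
            if (got == 0) break; // end of stream
            read += got;
        }
        return Encoding.ASCII.GetString(buffer, 0, read);
    }

    // Calls optimize(compressObjects, outputPath) with compression first and,
    // if the output fails isValid, once more without it. Returns the setting
    // that produced a valid file, or null if both attempts failed.
    public static bool? OptimizeWithFallback(
        Action<bool, string> optimize, Func<string, bool> isValid, string outputPath)
    {
        foreach (bool compress in new[] { true, false })
        {
            optimize(compress, outputPath);
            if (isValid(outputPath)) return compress;
        }
        return null;
    }
}
```

In practice the `optimize` delegate would wrap the `OptimizeResources` and `Save` calls from the original post, and `isValid` could combine `LooksLikePdf` with the `PdfFileInfo.IsPdfFile` check used there. This detects the failure after the fact; it does not avoid the processing time of the first attempt.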
Quick checklist
- [ ] Verify the Aspose.PDF for .NET version and upgrade if needed.
- [ ] Decrypt the PDF (provide the password) before calling any optimization.
- [ ] Use CompressObjects = false and enable LinkDuplicateStreams = true.
- [ ] Optionally use PdfDocumentOptimizer or flatten XFA forms for additional size reduction.
- [ ] Validate the output file with PdfFileInfo or a PDF viewer.
Follow these actions and let us know the exact library version and any exception messages you encounter; we’ll continue the investigation.
I could not fully verify this explanation against the retrieved sources. Please double-check the behaviour in your environment and confirm the exact SDK/version.
- We have the latest version, 25.10 (.NET)
- Decrypt is not an option as we need to maintain security (other tools are able to optimize without removing security)
- Flattening is not an option
My main concern is that Aspose corrupts the PDF without throwing any exception. Performance is a factor too. What can we detect ahead of time to know not to enable CompressObjects, or do we always have to “test” whether Aspose corrupted the PDF after every Optimize?
@markmanley.adlib
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-61224
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.