Free Support Forum - aspose.com

Corrupted PDF after compression

Hello, I need to compress existing PDFs that any user of my system can provide, not necessarily PDFs that were created using Aspose.


The example code at http://www.aspose.com/docs/display/pdfnet/Optimize+PDF+File+Size works pretty well, except that in some cases I end up with a corrupted PDF, which is no good.

So the questions are:

1) What is causing a valid PDF file to get corrupted after compressing it with Aspose?

2) Is there any way to tell if a PDF file is corrupted? (that way I can compress it and check if the result is ok)

3) Is there anything to look for in the file that is likely to cause corrupted output?

Thanks for your time.
Gonzalo

Hi Gonzalo,


Thanks for your inquiry. Please share your sample problematic document and code here; we will test the scenario on our end and guide you accordingly.

Furthermore, the following code will help you detect corrupt PDF documents.

private static bool IsCorrupt(string path)
{
    try
    {
        // Loading throws if the file cannot be opened as a PDF document.
        Document doc = new Document(path);
    }
    catch (Exception)
    {
        return true;
    }
    return false;
}
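
As a cheaper complement to the load test above, the raw bytes can be given a structural sanity check before they are stored. The following is a minimal sketch rather than an Aspose feature: `LooksLikePdf` is a hypothetical helper, and the rules it applies (a `%PDF-` header at offset 0 and an `%%EOF` marker within the last 1024 bytes) are an assumed heuristic based on how PDF readers commonly locate the end of the file. It uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of Aspose.Pdf): cheap structural sanity check.
static bool LooksLikePdf(byte[] data)
{
    byte[] header = Encoding.ASCII.GetBytes("%PDF-");
    byte[] eof = Encoding.ASCII.GetBytes("%%EOF");
    if (data == null || data.Length < header.Length + eof.Length) return false;

    // Assumption: require the header at the very first byte (a strict check).
    for (int i = 0; i < header.Length; i++)
        if (data[i] != header[i]) return false;

    // Assumption: the end-of-file marker must appear within the last 1024 bytes.
    int windowStart = Math.Max(0, data.Length - 1024);
    for (int i = data.Length - eof.Length; i >= windowStart; i--)
    {
        bool match = true;
        for (int j = 0; j < eof.Length; j++)
            if (data[i + j] != eof[j]) { match = false; break; }
        if (match) return true;
    }
    return false;
}

// A tiny stand-in for a PDF: just a header line and an end-of-file marker.
byte[] minimal = Encoding.ASCII.GetBytes("%PDF-1.4\n%%EOF\n");

// The same bytes followed by 4096 zero bytes of padding.
byte[] padded = new byte[minimal.Length + 4096];
Array.Copy(minimal, padded, minimal.Length);

Console.WriteLine(LooksLikePdf(minimal)); // True
Console.WriteLine(LooksLikePdf(padded));  // False
```

Passing this check does not prove that a file is valid; it only flags obviously truncated or padded output, so a full load such as IsCorrupt remains the stronger test.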


We are sorry for the inconvenience caused.

Best Regards,

Thanks for your response.
Attached is the file and below is the code from a console application I created to replicate the issue:

static void Main(string[] args)
{
    var folderPath = @"C:\test\";
    var fr = File.ReadAllBytes(System.IO.Path.Combine(folderPath, "FiOS_is_Here_(Don_Meyers).pdf"));
    using (var ms = new MemoryStream(fr))
    {
        var d = new Aspose.Pdf.Document(ms);

        var oo = new Aspose.Pdf.Document.OptimizationOptions();
        oo.LinkDuplcateStreams = true;
        oo.RemoveUnusedObjects = true;
        oo.RemoveUnusedStreams = true;
        oo.CompressImages = true;
        oo.ImageQuality = 40;

        d.OptimizeResources(oo);

        using (var compressedMs = new MemoryStream())
        {
            //this method corrupts the PDF
            d.Save(compressedMs, Aspose.Pdf.SaveFormat.Pdf);
            File.WriteAllBytes(System.IO.Path.Combine(folderPath, "savedToStream.pdf"), compressedMs.GetBuffer());

            //saving directly to file generates a valid pdf
            d.Save(System.IO.Path.Combine(folderPath, "savedToFile.pdf"));
        }
    }
}

Notice that I'm saving the optimized document to a stream because I need the file bytes to store them into the database.
If I save the document directly to a file on the hard drive it produces a valid PDF (with a different size than the other).

Regards,
Gonzalo

Hi Gonzalo,


Thanks for sharing the code and sample document. I have tested your code and noticed the reported issue when saving to a stream, so I have logged a ticket, PDFNEWNET-39983, in our issue tracking system for further investigation and resolution. We will keep you updated on the progress.

However, I have noticed that a valid PDF document is saved into the stream, but it gets corrupted when the stream contents are written to the file. You can try the following code snippet to test the scenario; it works fine.

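…

d.Save(compressedMs, SaveFormat.Pdf);

compressedMs.Position = 0; // CopyTo reads from the current position, so rewind first
using (var fs = new FileStream(System.IO.Path.Combine(myDir, "savedToStream.pdf"), FileMode.Create))
    compressedMs.CopyTo(fs); // disposing fs flushes and closes the file

//File.WriteAllBytes(System.IO.Path.Combine(myDir, "savedToStream.pdf"), compressedMs.GetBuffer());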

.....


We are sorry for the inconvenience caused.

Best Regards,

Hi Tilal, thanks for your assistance.

I’ve actually figured out what the problem was. It has nothing to do with your PDF component; it’s the GetBuffer method of MemoryStream.

It returns the buffer the stream is using, but that buffer doesn’t necessarily have the same size as the file; sometimes it’s bigger.
So when I was saving that buffer, the unused bytes at the end were written to the file as well and caused the corruption.

The trick was to use the MemoryStream.ToArray method, which returns only the bytes that were actually written to the stream.
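
The difference can be reproduced in a small standalone sketch that does not involve Aspose at all. The variable names are illustrative, and the snippet assumes C# 9 or later because it uses top-level statements.

```csharp
using System;
using System.IO;

using var ms = new MemoryStream();

// Write 5 bytes; the stream allocates its internal buffer in larger steps.
ms.Write(new byte[] { 1, 2, 3, 4, 5 }, 0, 5);

byte[] raw = ms.GetBuffer();  // the internal buffer, including unused capacity
byte[] exact = ms.ToArray();  // a copy of only the bytes written so far

Console.WriteLine(ms.Length);                  // 5
Console.WriteLine(exact.Length);               // 5
Console.WriteLine(raw.Length == ms.Capacity);  // True
Console.WriteLine(raw.Length >= exact.Length); // True
```

In the console application above, this corresponds to passing compressedMs.ToArray() instead of compressedMs.GetBuffer() to File.WriteAllBytes.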

Thanks again for your time.

Gonzalo

Hi Gonzalo,


Thanks for your feedback. It is good to know that you have managed to resolve the issue, so we will close the ticket logged above.

Please keep using our API and feel free to contact us for any further assistance.


Best Regards,