Splitting PDFs

When I split PDFs, I find that the total file size is dramatically bigger. A 5 MB multipage PDF ends up as 30 MB worth of single-page PDFs. I've tried using .Optimize() and .OptimizeResources(). Here is a code snippet:


var pdf = new Document(inputFilename);
var newPdf = new Document();
var currentPage = pdf.Pages[page];
newPdf.Pages.Add(currentPage);
newPdf.Save(outputFilename);

I assume this is because the newPdf I create lacks the compression of the original. Is there any way to fix this?

Hi there,


Thanks for your inquiry. Please optimize the resources before saving the resultant document; hopefully that will solve the issue. If the issue persists, please share your source document so we can test the scenario and provide information accordingly.

var pdf = new Document(inputFilename);
var newPdf = new Document();
var currentPage = pdf.Pages[page];
newPdf.Pages.Add(currentPage);
newPdf.OptimizeResources();
newPdf.Save(outputFilename);

Best Regards,

As I mentioned in my original question, I have tried OptimizeResources(); unfortunately it has no effect. If I output all the pages as single-page PDFs, I end up with a folder 5-6x the size.


How come there is no compression option when saving a page as a new PDF?


Hi Rupert,


Thanks for sharing the resource file.

I have tested the scenario and am able to reproduce the problem. I have logged it in our issue tracking system as PDFNEWNET-35353 so that it can be corrected. We will investigate this issue in detail and keep you updated on the status of a correction.

We apologize for the inconvenience.

Hi Rupert,


Thanks for your patience.

We have further investigated the issue PDFNEWNET-35353 reported earlier and, as per our observations, the problem seems to be related to the code snippet which you are using to split the document into single-page documents. In your code, you first create an empty document and then, in a for loop, you add the first page of the source document to the SAME document. Therefore the first saved document has one page, the second has two pages, and so on, and the 58th document has 58 pages, each a copy of the first page of the initial document, with a size of ~400K.

Please try the following code snippet to accomplish your requirement.

[C#]

Document pdf = new Document("c:/pdftest/00000076V00.pdf");

// Aspose.Pdf page numbers are 1-based, so count from 1 to Pages.Count.
for (int counter = 1; counter <= pdf.Pages.Count; counter++)
{
    // Create a fresh document for every page so pages do not accumulate.
    Document newPdf = new Document();
    Page currentPage = pdf.Pages[counter];
    newPdf.Pages.Add(currentPage);
    newPdf.OptimizeResources(/*new Document.OptimizationOptions() { RemoveUnusedStreams = true }*/);
    newPdf.Save("c:/pdftest/00000076V00_35353-" + counter + ".pdf");
}


Furthermore, the OptimizeResources method has been improved to eliminate unused color spaces, which gives the lowest file size in your case. Also, please note that we can't expect the total size of the single-page documents to be lower than the source document after splitting. In some cases it can be greater than the source PDF file.

For example, if a document with 100 pages contains a shared resource of size 100K that is used on every page, then after splitting, every one-page document will contain this resource separately, and the total size will be at least 99 * 100K = 9900K greater. (In fact it will be even greater, because each separated document has its own PDF document structure, which also takes up space in the file.)
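The estimate above can be written out as a small calculation. This is only an illustrative sketch: the class and method names (SplitSizeEstimate, ExtraKilobytes) are hypothetical and not part of Aspose.Pdf, and the figure is a lower bound that ignores the per-file PDF structure overhead.

```csharp
using System;

static class SplitSizeEstimate
{
    // Lower bound on the extra size (in KB) after splitting a document into
    // single-page files, when one resource is shared by every page.
    // The source holds 1 copy and the split files hold pageCount copies,
    // so the growth is (pageCount - 1) copies of the resource.
    public static long ExtraKilobytes(int pageCount, long sharedResourceKb)
    {
        if (pageCount < 1)
            throw new ArgumentOutOfRangeException(nameof(pageCount));
        return (long)(pageCount - 1) * sharedResourceKb;
    }

    static void Main()
    {
        // 100 pages sharing a 100K resource: 99 * 100 = 9900
        Console.WriteLine(ExtraKilobytes(100, 100));
    }
}
```

The same reasoning suggests that a 5 MB source growing to several times that size after splitting can be expected when large resources such as fonts or images are shared by many pages.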

The issues you have found earlier (filed as PDFNEWNET-35353) have been fixed in Aspose.Pdf for .NET 8.2.0.

