Split PDF file by size

Hello,

I have a requirement in my company that PDFs larger than 10MB must be split, and each part must not exceed 10MB. For instance, a 25MB document should be split into 3 parts, each smaller than 10MB.

I’ve achieved something similar by splitting the document by pages using the method explained here. For example, if the previous document has 50 pages, I divide it into 3 parts by page count, assuming each part will be less than 10MB. This assumption is unrealistic, because pages can have different contents, so each part can end up with a different size…

The real question is: is there any way to split a document by size instead of by pages? Or at least, is there a way to determine the size in bytes of the document I’m adding the extracted pages to, so I can avoid going over the file size limit?

Thanks in advance.


Hi there,


Thanks for your inquiry. I am afraid Aspose.Pdf currently does not split a PDF document on the basis of size. You may check the following Stack Overflow thread to get the size of an input stream, and share the results. Hopefully it will help you accomplish the task.


Please feel free to contact us for any further assistance.

Best Regards,


This does not help me much. It's OK that Aspose does not support this functionality (although, being a paid library, it should be supported...), but I'm wondering how to know the size of a com.aspose.pdf.Document while I'm building it, or the size of a single document page.

What I'm doing is more or less this code:
com.aspose.pdf.Document document = new com.aspose.pdf.Document();
for (int page = startPartPage; page < endPartPage; page++) {
    // here I want to check the size of the "document" object
    document.getPages().add(originalDocument.getPages().get_Item(page));
}
document.setOptimizeSize(true);
document.save(outFileName);

So, in each iteration I want to check the size of the document object, before adding another page to avoid exceeding the limit of 10MB and, of course, before saving the document to disk.

Tell me how to do it please.

Hi there,


Thanks for your feedback. We have logged an investigation ticket PDFJAVA-36221 in our issue tracking system for further investigation and rectification. We will keep you updated about the issue resolution progress.

We are sorry for the inconvenience.

Best Regards,

Why did you create this ticket? I don’t understand…

Do you know whether it is possible to get the size (in bytes) of a document while building it, or at least the size of a certain page… or not?

Hi there,

We have logged the ticket for internal discussion. Please note that the only way to know the size of the resulting document is to save it to a ByteArrayOutputStream, check the size of the resulting stream, and proceed accordingly. Please see the code snippet below. Hopefully it will help you accomplish the task.

import java.io.ByteArrayOutputStream;

com.aspose.pdf.Document document = new com.aspose.pdf.Document();
ByteArrayOutputStream stream = new ByteArrayOutputStream();

for (int page = startPartPage; page < endPartPage; page++) {
    // Add the next page to the part being built
    document.getPages().add(originalDocument.getPages().get_Item(page));

    // Save the part to memory and read back its current size in bytes
    stream.reset();
    document.save(stream);

    System.out.println("size: " + stream.size());
}

document.setOptimizeSize(true);
document.save(testdata + "PDFJAVA_36221/out" + version + ".pdf");
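
To turn the size check into an actual split, one option is a greedy loop: keep adding pages to the current part, and as soon as the measured size goes over the limit, drop the last page and start a new part with it. The sketch below shows only that grouping logic and does not call Aspose at all. The class name SizeSplitter and the measure callback are hypothetical names introduced for illustration; in real use the callback would presumably build a Document from the given pages, save it to a ByteArrayOutputStream as above, and return stream.size(). Page numbers are assumed to start at 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

public class SizeSplitter {

    /**
     * Greedily groups page numbers 1..pageCount into parts whose measured
     * size never exceeds maxBytes. "measure" is a hypothetical callback that
     * returns the size in bytes of a part made of the given pages.
     */
    public static List<List<Integer>> split(int pageCount, long maxBytes,
                                            ToLongFunction<List<Integer>> measure) {
        List<List<Integer>> parts = new ArrayList<>();
        List<Integer> current = new ArrayList<>();

        for (int page = 1; page <= pageCount; page++) { // assumption: 1-based page numbers
            current.add(page);
            if (measure.applyAsLong(current) > maxBytes) {
                if (current.size() == 1) {
                    throw tooLarge(page);
                }
                // The new page pushed the part over the limit: close the part without it
                current.remove(current.size() - 1);
                parts.add(current);

                // Start the next part with the page that did not fit
                current = new ArrayList<>();
                current.add(page);
                if (measure.applyAsLong(current) > maxBytes) {
                    throw tooLarge(page);
                }
            }
        }
        if (!current.isEmpty()) {
            parts.add(current);
        }
        return parts;
    }

    private static IllegalArgumentException tooLarge(int page) {
        return new IllegalArgumentException("Page " + page + " alone exceeds the size limit");
    }

    public static void main(String[] args) {
        // Fake per-page sizes standing in for real measurements (illustrative only)
        long[] pageBytes = {4, 5, 3, 9, 2, 2};
        ToLongFunction<List<Integer>> fakeMeasure =
                pages -> pages.stream().mapToLong(p -> pageBytes[p - 1]).sum();

        System.out.println(split(pageBytes.length, 10, fakeMeasure));
        // prints [[1, 2], [3], [4], [5, 6]]
    }
}
```

Note that this measures the part once per added page, so with a real save-to-memory callback the document is serialized many times. A single page that is larger than the limit cannot be split this way, so the sketch just reports it.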

Best Regards,