Hi Nayyer
The goal is to be able to merge 1000+ PDF documents into one PDF. Each PDF document is different and can be of any format and page count. That is the reason why my for loop creates a new instance of a PDF document for every document in the temp folder.
Also, while you are at it, can you look into the Concatenate and Append methods in the PdfFileEditor class? They too generate a huge resultant PDF.
Please share an update or ETA on this fix. I have been waiting for this fix for a long time.
Appreciate it.
Thanks Nayyer
For the concatenation, I used the same code that you used, and it generates an 80 MB file, which is very large considering it is only a 6000-page PDF.
Thanks
Hi Nayyer,
I have attached a sample project; in it you will find two ways of generating a merged PDF. One is appending each sample Word document into one large Word document and then converting that to a PDF file. In the other, I used both Aspose.Words and Aspose.Pdf to create one PDF document: that is, I convert each Word document to PDF and then merge all the PDFs into one PDF document. You will see that the two generated PDF files differ hugely in size, which is why I think there is a lot of room to reduce the size of the PDF file created by Aspose.Pdf. Maybe I am wrong and it is not possible, but please research this and share your findings.
Hello Ujjwal,
Thanks for sharing the sample application.
I have tested the scenario using Aspose.Words for .NET 10.8.0 and Aspose.Pdf for .NET 6.7.0. As per my observations, the PDF file generated using Aspose.Words is 263 KB, while the resultant PDF generated with Aspose.Pdf for .NET is 845 KB. Please note that each product has its own mechanism for generating PDF files, so there is a difference in the resultant file size. Also, Aspose.Words for .NET simply renders the merged Word file into PDF format through its rendering engine, whereas Aspose.Pdf for .NET combines the pages of 10 individual PDF files (each document is 85.7 KB) into a single resultant document. However, I will discuss this scenario further with the development team to see if we can reduce the size of the PDF file being generated.
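As a quick sanity check on the figures above, here is a small sketch in plain Python (arithmetic only, not Aspose code; the variable names are purely illustrative) that compares the combined size of the 10 input PDFs with the two observed outputs.

```python
# Figures reported above, in KB.
input_pdf_kb = 85.7        # each individually converted PDF
input_count = 10           # number of PDFs that were concatenated
merged_by_pdf_kb = 845     # output of concatenating with Aspose.Pdf
merged_by_words_kb = 263   # output of rendering the merged Word file

sum_of_inputs_kb = input_pdf_kb * input_count
ratio_to_inputs = merged_by_pdf_kb / sum_of_inputs_kb
ratio_to_words = merged_by_pdf_kb / merged_by_words_kb

print(f"sum of inputs: {sum_of_inputs_kb:.0f} KB")
print(f"concatenated output / sum of inputs: {ratio_to_inputs:.2f}")
print(f"concatenated output / Words output: {ratio_to_words:.1f}")
```

The concatenated file comes out close to the plain sum of its inputs, which suggests that concatenation carries each document's objects over more or less as they are.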
Your patience and understanding are greatly appreciated in this regard. We are really sorry for the inconvenience.
Hello Ujjwal,
Thanks for your patience.
Our development team has spent some time investigating the reasons for this problem further, and the following are our observations.
The first sample file which you have shared has a size of ~80 KB. It has 12 pages, and the content of each of these pages is about 2 KB; the file also contains 3 fonts in its resources, with sizes of 30 KB, 16 KB, and 3 KB (these are the sizes of the compressed objects).
Please note that all these objects must be included in the concatenated file. If you concatenate an 80 KB file 2000 times, you will get a resultant file whose size is about 2000 * 80 KB = 160,000 KB = 160 MB.
Currently we are not certain that the size can be significantly reduced. (If we treat every input as the same file, as in the sample, we can reduce the size by sharing some objects, for example the fonts, across all copies of the file contents in the resultant file; but this will not work for different files!)
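To illustrate the effect of sharing fonts, here is a rough sketch in plain Python (a simplified size model, not Aspose code). It uses only the object sizes quoted above; the function name `estimate_size_kb` is hypothetical, and the model ignores cross-reference tables, metadata, and any other overhead.

```python
# Object sizes reported for the first sample file, in KB (compressed).
PAGE_CONTENT_KB = 2
PAGES_PER_COPY = 12
FONT_SIZES_KB = [30, 16, 3]


def estimate_size_kb(copies: int, share_fonts: bool) -> int:
    """Rough size estimate for a file made of `copies` identical inputs."""
    pages_kb = copies * PAGES_PER_COPY * PAGE_CONTENT_KB
    fonts_per_copy_kb = sum(FONT_SIZES_KB)
    # With sharing, the fonts are stored once; otherwise once per copy.
    fonts_kb = fonts_per_copy_kb if share_fonts else copies * fonts_per_copy_kb
    return pages_kb + fonts_kb


duplicated = estimate_size_kb(2000, share_fonts=False)
shared = estimate_size_kb(2000, share_fonts=True)
print(duplicated, shared)  # 146000 48049
```

The listed objects add up to 73 KB per copy, slightly under the ~80 KB file size, so real figures would be somewhat higher than this model gives. The saving in the shared case depends entirely on the inputs carrying identical fonts.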
Concerning the second example, which uses 10 files, please note that the file created with Aspose.Words contains 4 fonts which are shared by all document pages. However, when each individual Word file is converted to PDF and the resulting files are then concatenated, the fonts are included as separate objects for each of the 10 documents. This causes the difference in size.
Thanks, Nayyer, for researching this in detail. If possible, can you check whether Aspose.Pdf.Kit does the same thing? I don't have that DLL anymore to test this scenario.
Can you please leave this issue open so that your team can research it and see whether the file size can be reduced and the output compressed further in future?
Again, I appreciate the time you spent looking into this. Thanks
Hello Ujjwal,
I have also tested the scenario of concatenating 2000 copies of the 80 KB source file Sample+Document1.pdf. As per my observations with Aspose.Pdf.Kit for .NET 6.0.0, the process took 5 minutes and 38 seconds and generated a resultant file of 67.7 MB. In another attempt, I used Aspose.Pdf.Facades to perform the same task; the process took around 1 minute and 23 seconds and generated a resultant file of 153 MB. The product version of Aspose.Pdf for .NET is 6.7.0.
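To put the two runs side by side, here is a small sketch in plain Python (arithmetic on the figures above, not Aspose code). The helper `per_copy` is a hypothetical name, and the conversion assumes 1 MB = 1024 KB.

```python
COPIES = 2000

# (output size in MB, elapsed seconds) as reported above.
runs = {
    "Aspose.Pdf.Kit for .NET 6.0.0": (67.7, 5 * 60 + 38),
    "Aspose.Pdf.Facades (Aspose.Pdf for .NET 6.7.0)": (153.0, 1 * 60 + 23),
}


def per_copy(size_mb: float, seconds: float, copies: int = COPIES):
    """Return (KB written per copy, copies processed per second)."""
    # Assumes 1 MB = 1024 KB.
    return size_mb * 1024 / copies, copies / seconds


for name, (size_mb, seconds) in runs.items():
    kb, rate = per_copy(size_mb, seconds)
    print(f"{name}: {kb:.1f} KB per copy, {rate:.1f} copies/s")
```

On these numbers, the Aspose.Pdf.Kit output works out to roughly 35 KB per copy against roughly 78 KB per copy for Aspose.Pdf.Facades, while the Facades run processes copies about four times faster.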
We will definitely consider these findings while resolving this problem. Please note that I have already re-opened this issue, and as soon as we have made significant progress towards a resolution, we will be more than happy to update you on the status of the fix. Please be patient and allow us a little time.