PDF Lossless Compression

Hi,

We have a large volume of data (several TB), mostly in the form of scanned PDFs.
Our requirement is to reduce the data size by 20%, but the compression must be lossless, i.e. it must use a lossless compression algorithm.

Does Aspose provide a Java API that can be used for the following tasks:

  • Compress existing PDF files by 20% using lossless algorithms.
  • An API method to select from different standard lossless algorithms.
  • An option to compare the compressed and original PDF files and show any changes caused by compression.

Thanks,
Gorav

Hi Gorav,


Thanks for contacting support.

Aspose.Pdf for Java supports compressing existing PDF files, but at the moment I am not certain which algorithm is used for compression. I have asked the development team to share this information, and I will let you know as soon as I have an update.

Meanwhile, you may consider visiting the following link for further details on Optimize PDF file size

Hi Nayyer,

Thanks for replying.

It would be really helpful if you could get answers to the queries in my first post, in addition to the type of algorithm used, as we have to finalize the tool ASAP.

Meanwhile, I will check the optimize option suggested by you.

Thanks

Gorav Gandhi

+919739481270

link.gauravgandhi:
Option to compare compressed and original PDF files and show changes if any due to compression.
Hi Gorav,

I am afraid the current release of Aspose.Pdf for Java does not support the feature to compare two PDF documents. However, we already have logged this requirement as PDFNEWJAVA-33557 in our issue tracking system. We will further look into the details of this problem and will keep you updated on the status of correction.

Concerning your query about the compression algorithm, I am coordinating with the development team and will get back to you soon.

Hi Nayyar,

I have tried the Optimize PDF file size feature you mentioned, but the results are the opposite of what I expected: the file size increased after compression.

Below are some of the test results:

S.No   Original File Size (KB)   Compressed File Size (KB)
1      12.44                     13.19
2      36.11                     55.42
3      105.67                    118.75
4      75.97                     94.44
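
To put these figures in terms of the 20% target, the relative size change can be worked out as in the minimal sketch below. The class and method names (SizeChange, percentChange, meetsTarget) are purely illustrative and are not part of any Aspose API; the sizes are the ones from the table.

```java
import java.util.Locale;

public class SizeChange {

    /** Percentage change from the original size; a positive value means the file grew. */
    public static double percentChange(double originalKb, double compressedKb) {
        return (compressedKb - originalKb) / originalKb * 100.0;
    }

    /** True when the output is at least targetPercent smaller than the input. */
    public static boolean meetsTarget(double originalKb, double compressedKb, double targetPercent) {
        return percentChange(originalKb, compressedKb) <= -targetPercent;
    }

    public static void main(String[] args) {
        // {original KB, compressed KB} pairs taken from the test results
        double[][] results = {
            {12.44, 13.19}, {36.11, 55.42}, {105.67, 118.75}, {75.97, 94.44}
        };
        for (double[] r : results) {
            System.out.printf(Locale.ROOT,
                    "%.2f KB -> %.2f KB: %+.1f%% (meets 20%% target: %b)%n",
                    r[0], r[1], percentChange(r[0], r[1]), meetsTarget(r[0], r[1], 20.0));
        }
    }
}
```

Every row gives a positive change, so none of the four files comes anywhere near the required reduction.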

Here is my test code, which is almost identical to the code snippet you pointed to in your first reply above.

Code Snippet:

com.aspose.pdf.Document doc = new Document("input/Test.pdf");

// optimize the file size by removing unused objects
com.aspose.pdf.Document.OptimizationOptions opt = new Document.OptimizationOptions();
opt.setRemoveUnusedObjects(false);
opt.setRemoveUnusedStreams(false);
opt.setLinkDuplcateStreams(false);
doc.optimizeResources(opt);

// save the updated file
doc.save("output/compressed_test.pdf");

I have also tried without setting the optimization options, but the result was the same, i.e. the file size increased after compression.

Could you please help me with this and suggest which method to use for compression?

Thanks

Gorav Gandhi

Hi Gorav,

Thanks for your patience.

Please note that Aspose.Pdf for Java uses the FlateDecode algorithm for document compression, and it is mostly effective on textual data. However, as per your first post, you need to compress scanned PDF documents, so it would be great if you could share some sample PDF files so that we can try the compression at our end.

PS: To compress a PDF file containing images, you can also reduce the size of the images inside the PDF file.
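
As a rough illustration of why FlateDecode helps text far more than scanned pages, here is a minimal sketch using only the JDK's java.util.zip classes (the Deflate format that FlateDecode is based on), not Aspose.Pdf itself. The class name FlateDemo and its helper methods are illustrative, and the seeded random bytes are only an assumed stand-in for image data that is already compressed, as scanned page images often are.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class FlateDemo {

    /** Compresses the input with zlib/Deflate at the highest compression level. */
    public static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Restores the original bytes; Deflate is lossless, so this is an exact inverse. */
    public static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("Truncated or unsupported stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Not a valid zlib stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Highly repetitive text, standing in for textual PDF content. */
    public static byte[] sampleText() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("Invoice line item, quantity, unit price, total amount. ");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Seeded random bytes, an assumed stand-in for already-compressed image data. */
    public static byte[] sampleNoise(int length) {
        byte[] bytes = new byte[length];
        new Random(42).nextBytes(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        byte[] text = sampleText();
        byte[] noise = sampleNoise(100_000);
        System.out.println("text:  " + text.length + " -> " + deflate(text).length + " bytes");
        System.out.println("noise: " + noise.length + " -> " + deflate(noise).length + " bytes");
        System.out.println("lossless round trip: " + Arrays.equals(inflate(deflate(text)), text));
    }
}
```

The text sample shrinks substantially while the high-entropy sample does not shrink at all, which is consistent with files growing when data that is already compressed gets re-encoded. The round trip also shows that the algorithm itself loses no information.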

Hi Nayyar,

Thanks for your efforts and reply.

So does this mean the Optimize PDF file size option can only be used for textual PDFs and not for scanned PDFs?

As I mentioned in my last reply (with the code snippet), the size of the PDF increased instead of decreasing after using the Optimize PDF option.

Regards

Gorav Gandhi

Hi Gorav,


The compression algorithm currently in use is more effective for PDF files containing text. However, for PDF files containing images, we can still try to reduce the file size. As requested earlier, please share your sample PDF files so that we can test the scenario at our end.

Hi Gorav,


Thanks for your patience.

We have further investigated the feature requested earlier for comparing two PDF files. To accomplish your requirement, we suggest you try the ComparisonApp from our sister company, GroupDocs. If you have any further queries, please feel free to contact us.