Downscale images in PDF file

Hi,


I downloaded the latest version of Aspose.Pdf for .NET (with a 30-day temporary license) and I’m trying to see whether the library can meet our requirements. There are two main requirements: split and optimize. I was able to do both, but when testing with several files I saw that each output PDF is almost the same size as the source file. The source file is around 26 MB with 8 pages. I split it into 4 files of 2 pages each, and each file is about 26 MB. In the code, I call the OptimizeResources method before saving the document.

I spent some time searching the forum for a way to downscale the images (something similar to the Acrobat Pro optimizer), but I couldn’t find a post that does it directly. The only reference was a link showing how to extract an image and then replace it. During this process you can save the XImage to a stream and specify a resolution. With this “workaround” the final file is around 2.5 MB. I did the same without specifying a resolution and got the same file size. So it seems the “trick” is re-attaching the image: with that step (in addition to calling OptimizeResources), the file goes from 26 MB to 2.5 MB. Is something “missing” from the optimization process, or is there another path I could use?

Here is the code:

// Requires: using System.IO; using Aspose.Pdf;
// fileName, numPages and resolution are defined elsewhere.
using (MemoryStream stream = new MemoryStream(File.ReadAllBytes(fileName)))
{
    Document pdfDocument = new Document(stream);
    int totalPages = pdfDocument.Pages.Count;

    // Page indexes are 1-based; i is advanced inside the inner loop.
    for (int i = 1; i <= totalPages;)
    {
        Document newDocument = new Document();

        // Split: copy up to numPages consecutive pages, stopping at the last page.
        for (int j = 0; j < numPages && i <= totalPages; j++)
        {
            newDocument.Pages.Add(pdfDocument.Pages[i]);
            i++;
        }

        foreach (Page pdfPage in newDocument.Pages)
        {
            int totalImages = pdfPage.Resources.Images.Count;

            // Re-save each image at the given resolution and replace it in place.
            for (int m = 1; m <= totalImages; m++)
            {
                XImage image = pdfPage.Resources.Images[m];
                using (MemoryStream imageStream = new MemoryStream())
                {
                    image.Save(imageStream, resolution);
                    pdfPage.Resources.Images.Replace(m, imageStream);
                }
            }
        }

        newDocument.OptimizeResources();

        //Save newDocument
    }
}


Regards,

Luis.

Hi Luis,


Thanks for using our products.

The OptimizeResources(…) method eliminates duplicate and unused objects from the PDF file. If the document consists mainly of images, it is better to downscale the image resolution. The approach you have described above is correct for reducing the size of an image-based PDF document. If you encounter any issue or have any further queries, please feel free to contact us.
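
As a possible alternative to the extract-and-replace loop, later releases of Aspose.Pdf for .NET let OptimizeResources take an options object that compresses and resizes images in one pass. The following is only a minimal sketch: it assumes your installed version includes the Aspose.Pdf.Optimization namespace, and the class name, method name, file paths, quality value, and 150 DPI cap are illustrative choices, not values from this thread.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Optimization;

// Hypothetical helper class; the name is illustrative only.
public static class PdfImageDownscaleSketch
{
    public static void Downscale(string inputPath, string outputPath)
    {
        using (Document document = new Document(inputPath))
        {
            OptimizationOptions options = new OptimizationOptions();

            // Recompress embedded images at a reduced quality (assumed value).
            options.ImageCompressionOptions.CompressImages = true;
            options.ImageCompressionOptions.ImageQuality = 75;

            // Resize images whose resolution exceeds the cap (assumed value, in DPI).
            options.ImageCompressionOptions.ResizeImages = true;
            options.ImageCompressionOptions.MaxResolution = 150;

            document.OptimizeResources(options);
            document.Save(outputPath);
        }
    }
}
```

If this overload is not available in your version, the image replacement approach shown in the question remains the way to go.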