Problem with OptimizeResources and ImageCompress

Hi,

I have a large volume of PDF documents whose size I need to reduce.
I tested many APIs and got the best results with Aspose.PDF.

However, in some files, pages become unreadable after image compression.
I use the OptimizeResources method with the following parameters:

var optimizeOptions = new OptimizationOptions();
optimizeOptions.RemoveUnusedObjects = true;
optimizeOptions.RemoveUnusedStreams = true;
optimizeOptions.LinkDuplcateStreams = true;
optimizeOptions.AllowReusePageContent = true;
optimizeOptions.ImageCompressionOptions.CompressImages = true;
optimizeOptions.ImageCompressionOptions.ImageQuality = 50;
optimizeOptions.ImageCompressionOptions.ResizeImages = true;
optimizeOptions.ImageCompressionOptions.MaxResolution = 150;
docPdf.OptimizeResources(optimizeOptions);

With CompressImages disabled, my documents are fully readable again, but we lose all the size gains.

Are there particular parameters or methods I should use, or is this expected behavior?

Tested on Aspose.PDF 19.01, 19.02 and 19.03.
Some examples and their rendering after processing:

50767_page-2.pdf (50.0 KB)
50767_page-2_optimize.pdf (43.9 KB)
117721_page-2_optimize.pdf (216.7 KB)
117721_page-2.pdf (314.6 KB)
125314_page-181.pdf (225.1 KB)
125314_page-181_OPTI.pdf (220.9 KB)

@sriviere.sogedi.fr

Thanks for contacting support.

We were able to replicate the issues in our environment using Aspose.PDF for .NET 19.3 and have logged them separately for your files in our issue tracking system as follows:

  • PDFNET-46227 (50767_page-2.pdf)
  • PDFNET-46228 (117721_page-2.pdf)
  • PDFNET-46229 (125314_page-181.pdf)

We will look further into the details of the logged issues and keep you posted on the status of their correction. Please be patient and spare us a little time.

We are sorry for the inconvenience.

Hello,

Have you identified the problem?
We are waiting for your response to decide whether we will buy the Aspose.PDF license or need to turn to another solution.

@sriviere.sogedi.fr

We regret to share that the earlier logged issues are not yet resolved. Please note that we resolve issues on a first come, first served basis and have been busy resolving other issues logged prior to yours. However, we have recorded your concerns and will definitely consider them during the investigation. We will let you know as soon as further updates are available. Please spare us a little time.

We are sorry for the inconvenience.

Hello,

Have you had time to look at my problem?

Regards,

@sriviere.sogedi.fr

We regret to share that your issues are not yet resolved due to other high-priority issues and ongoing development of new features. However, we have taken your concerns into account and will definitely try to schedule the investigation of your logged issues. We will provide an update as soon as we have one regarding issue resolution. Please spare us a little time.

Hello,

Since the last time (6 months!), we bought the Aspose.PDF license and updated to version 19.9, but the same problem still occurs.
Can you solve it, please?

Thank you,

@sriviere.sogedi.fr

We really apologize for the inconvenience you have been facing.

Your issues are sadly not yet resolved due to other high-priority implementations and issues. Under the free support model, the priority of your issues is low and they will be resolved on a first come, first served basis. Despite that, we will consider your concerns when scheduling the fixes for the reported issues. We greatly appreciate your patience and understanding in this regard. Please spare us a little time.

We are sorry for the inconvenience.

Dear,

We are facing the same issue. Since this is very important for our business and we hold multiple bulk licenses, can we get an ETA or an indication of when this will be picked up? It is quickly becoming a priority here.

Kind regards,
Jonathan

@jverwilghen

As shared in our earlier response, the priority of these issues is low and they will be handled on a first come, first served basis. However, we will definitely take your concerns into account during the investigation of these issues. If these issues are blockers or showstoppers for you, you may want to check our priority support options, where issues are resolved on an urgent basis. We will inform you as soon as we have an update regarding the fix. Please spare us a little time.

We are sorry for the inconvenience.

Hello,

We found a temporary workaround for the problem (pending an unlikely fix from support).
As a precaution, we pre-process the document by making a copy of each image and replacing the original with the copy. Only then do we call the optimization method.

for (int ipage = 1; ipage <= doc.Pages.Count; ipage++)
{
	for (int iimage = 1; iimage <= doc.Pages[ipage].Resources.Images.Count; iimage++)
	{
		var streamImage = new MemoryStream();
		try
		{
			// Save a JPEG copy of the image first, then swap it in for the original.
			doc.Pages[ipage].Resources.Images[iimage].Save(streamImage, ImageFormat.Jpeg);
			streamImage.Position = 0; // rewind so the copy is read from the start
			doc.Pages[ipage].Resources.Images.Replace(iimage, streamImage);
		}
		catch (Exception e)
		{
			Console.WriteLine("Error from page {0}. Image {1} : {2}", ipage, iimage, e.Message);
		}
		finally
		{
			streamImage.Close();
		}
	}
	// Release the page's cached resources before moving on to the next page.
	doc.Pages[ipage].FreeMemory();
}
...
doc.OptimizeResources(options);

It works for us, even though it is much slower.
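
For readers who want to measure that trade-off, here is a minimal sketch that strings the steps from this thread together: re-save the images, optimize, save, then report the elapsed time and the size change. It only uses calls that already appear in this thread, plus loading and saving a Document. The class name, the SavedPercent helper, and the command-line paths are illustrative assumptions, the option values are copied from the first post, and nothing here is confirmed to avoid the corruption on any particular Aspose.PDF version.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Optimization;

public static class OptimizeWithWorkaround
{
    // Hypothetical helper: percentage of the original size that was saved.
    public static double SavedPercent(long before, long after)
    {
        return before == 0 ? 0.0 : 100.0 * (before - after) / before;
    }

    // Re-saves every image and swaps the copy in, as in the workaround above.
    public static void ReencodeImages(Document doc)
    {
        for (int ipage = 1; ipage <= doc.Pages.Count; ipage++)
        {
            var images = doc.Pages[ipage].Resources.Images;
            for (int iimage = 1; iimage <= images.Count; iimage++)
            {
                using (var copy = new MemoryStream())
                {
                    images[iimage].Save(copy);
                    copy.Position = 0; // rewind before handing the copy back
                    images.Replace(iimage, copy);
                }
            }
            doc.Pages[ipage].FreeMemory();
        }
    }

    public static void Main(string[] args)
    {
        // Assumption: input and output paths are passed on the command line.
        string input = args[0];
        string output = args[1];

        var sw = Stopwatch.StartNew();
        using (var doc = new Document(input))
        {
            ReencodeImages(doc);

            // Same settings as in the first post of this thread.
            var options = new OptimizationOptions
            {
                RemoveUnusedObjects = true,
                RemoveUnusedStreams = true,
                LinkDuplcateStreams = true, // spelled this way in the API
                AllowReusePageContent = true
            };
            options.ImageCompressionOptions.CompressImages = true;
            options.ImageCompressionOptions.ImageQuality = 50;
            options.ImageCompressionOptions.ResizeImages = true;
            options.ImageCompressionOptions.MaxResolution = 150;

            doc.OptimizeResources(options);
            doc.Save(output);
        }
        sw.Stop();

        long before = new FileInfo(input).Length;
        long after = new FileInfo(output).Length;
        Console.WriteLine("{0} -> {1} bytes ({2:F1}% saved) in {3} ms",
            before, after, SavedPercent(before, after), sw.ElapsedMilliseconds);
    }
}
```

Running the same files with the ReencodeImages call commented out gives a baseline for how much time the extra pass costs.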

Regards,

@sriviere.sogedi.fr

It is good to know that you were able to work around your issue with the approach you shared. Please keep using our API. As soon as we have an update regarding the issue resolution, we will share it in this forum thread.

The issues you have found earlier (filed as PDFNET-46227) have been fixed in Aspose.PDF for .NET 20.2.

Just wanted to add: as of version 21.10, we get image corruption on some documents. The workaround above (replacing the images again) works. The source PDF comes from Aspose.Words 21.5 (with optimization already applied there when saving). When we then run Aspose.PDF OptimizeResources with image compression, the issue occurs. The images in question are small 80x60 “icon” images.

@SteinGT

Can you please share your sample source PDF along with the corrupted output PDF obtained from Aspose.PDF? Please also share the complete code snippet needed to replicate the issue, so that we can test the scenario in our environment and address it accordingly.

PS: Please also share the code snippet to replace the image that you are using as workaround.

I am short on time to create the full scenario, but the workaround is the same as above, slightly modified:

private static void FixImagesBecauseOfAsposeBug(ITaskInstance task, Document doc)
{
    var sw = Stopwatch.StartNew();
    task.Log.Trace($"Workaround fix for Images.");
    for (int ipage = 1; ipage <= doc.Pages.Count; ipage++)
    {
        for (int iimage = 1; iimage <= doc.Pages[ipage].Resources.Images.Count; iimage++)
        {
            try
            {
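                // Copy the current image into an in-memory stream...
                var img = doc.Pages[ipage].Resources.Images[iimage];
                var streamImage = new MemoryStream();
                img.Save(streamImage);

                // ...then replace the original image with that copy.
                doc.Pages[ipage].Resources.Images.Replace(iimage, streamImage);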
            }
            catch (Exception e)
            {
                task.Log.Error($"Error from page {ipage}. Image {iimage} : {e.Message}");
            }
        }
        doc.Pages[ipage].FreeMemory();
    }
    task.Log.Trace($"Workaround fix for Images DONE. {(int)sw.ElapsedMilliseconds} ms");
}

If I get time to isolate and create a sample I’ll add it here.

@SteinGT

Thanks for sharing the sample code snippet. Please note that we also need a sample PDF document along with the sample images that you are trying to replace in the source PDF. This way we will be able to replicate the issue on our end and address it accordingly.