Free Support Forum - aspose.com

Save HTML to PDF-A_1B does not generate indexed images anymore

Hello.

Up to release 18.3, saving the linked HTML file as PDF-A_1B produced a PDF file containing indexed images (where possible). This kept the PDF size small.
Release 18.6 no longer generates indexed images, so the PDF is larger.

Is this by design? Is there a way, in PdfSaveOptions, to prefer indexed images when possible?

Thank you.

HtmlToPdfA1B.zip (166.0 KB)

@isispapyrusdev1,

“AsposeConverted_18_6.pdf” has a file size of 82KB while “AsposeConverted_18_3.pdf” has a file size of 80KB. Is this 2KB the size difference you are asking about?

Hello Awais.

Thanks for your analysis and question.

Yes, in this case, with a very small sample image chosen for testing convenience, the difference is only 2 kB. In real cases, we have seen files grow from 160 kB to 220 kB. Everything in the PDF file is the same except the images, which are no longer indexed, so their streams are dramatically bigger.

In the linked example, PDF object 23 0 obj has grown from 7968 to 10175, which is more than 20% larger (about 28%). In a file with many indexed images, this overhead is too heavy. I have attached an image that shows this difference in the PDF image dictionary.

On my side, the question is: what changed in the image handling, and how can indexed mode be enabled again?
Certainly not all images can be indexed, but, when possible, it would be better to keep converting them as indexed, as was done in the past.
I’m asking whether the old behavior can be restored via some PdfSaveOptions property, or whether you could kindly investigate to explain the change and, if possible, bring back the previous conversion behavior.

If anything is missing or unclear, please let us know.
Thank you for your help.

Differences.png (28.8 KB)
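
To illustrate the point above that only some images can be indexed, here is a minimal Java sketch using only the standard library. It is not Aspose.Words code and says nothing about how the library decides internally; the class name IndexedImageCheck and its helper methods are hypothetical names chosen for this example. It assumes an 8-bit palette (at most 256 entries) and compares uncompressed sizes only, so real PDF streams, which are usually compressed, will differ.

```java
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, for illustration only (not part of Aspose.Words).
class IndexedImageCheck {
    // An 8-bit indexed image can reference at most 256 palette entries.
    static final int MAX_PALETTE_ENTRIES = 256;

    // Counts distinct pixel values, stopping early once the limit is exceeded.
    static int countDistinctColors(BufferedImage image, int limit) {
        Set<Integer> colors = new HashSet<>();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                colors.add(image.getRGB(x, y));
                if (colors.size() > limit) {
                    return colors.size();
                }
            }
        }
        return colors.size();
    }

    // True when every color in the image fits in an 8-bit palette.
    static boolean canBeIndexed(BufferedImage image) {
        return countDistinctColors(image, MAX_PALETTE_ENTRIES) <= MAX_PALETTE_ENTRIES;
    }

    // Uncompressed size of 24-bit RGB samples: 3 bytes per pixel.
    static long rawRgbBytes(int width, int height) {
        return 3L * width * height;
    }

    // Uncompressed size of 8-bit indices plus an RGB palette (3 bytes per entry).
    static long rawIndexedBytes(int width, int height, int paletteEntries) {
        return (long) width * height + 3L * paletteEntries;
    }

    public static void main(String[] args) {
        int width = 100;
        int height = 100;
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        // Fill with a two-color checkerboard, which is trivially indexable.
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                image.setRGB(x, y, ((x + y) % 2 == 0) ? 0xFFFFFF : 0x003366);
            }
        }
        int colors = countDistinctColors(image, MAX_PALETTE_ENTRIES);
        System.out.println("Distinct colors: " + colors);
        System.out.println("Can be indexed: " + canBeIndexed(image));
        System.out.println("Raw RGB bytes: " + rawRgbBytes(width, height));
        System.out.println("Raw indexed bytes: " + rawIndexedBytes(width, height, colors));
    }
}
```

Before compression, an indexed image needs roughly one byte per pixel plus a small palette instead of three bytes per pixel, which is consistent with the larger image streams we see when indexing is not applied.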

@isispapyrusdev1,

Thanks for the additional information. We tested the scenario and managed to reproduce the problem on our end. We have logged it in our issue tracking system so that it can be fixed. The ID of this issue is WORDSJAVA-1820. We will look further into the details of this problem and keep you updated on the status of the fix. We apologize for the inconvenience.

The issues you have found earlier (filed as WORDSJAVA-1820) have been fixed in this Aspose.Words for .NET 18.8 update and this Aspose.Words for Java 18.8 update.
