Free Support Forum - aspose.com

Aspose.Words modifies bitmap image when exporting into xml for Aspose.Pdf

Hi,

I'm using mail merge to insert some PNG images into a Word template, exporting it into XML format for Aspose.Pdf, converting it to PDF, and then using Ghostscript to produce a PostScript file which I send to a printer.
The problem is that Aspose.Words somehow modifies these PNG images (the generated temporary copies of the image files are bigger). This modification makes the resulting PDF bigger, and the PDF-to-PS transformation performs extremely badly (it takes ages and produces a huge PS file). If I replace the temporary image files with the original ones, everything works fine.

please see the attached files:

test.dot - test template
encode1.png-encode10.png - original png files
Aspose.Words.001.png-Aspose.Words.010.png - modified png files produced by aspose words
test1.xml - xml with pngs generated by aspose.words
test2.xml - xml with original pngs
test1.pdf - pdf generated from test1.xml
test2.pdf - pdf generated from test2.xml
test1.ps - ps generated from test1.pdf
test2.ps - ps generated from test2.pdf

cheers
Robert

Thank you for reporting this problem to us. Please give me one day, I need to discuss this issue with the rest of the team.

Best regards,

I see that the PNG image created by Aspose.Words is 3.45 KB whereas the original PNG image is 1.06 KB. The "correct" PDF file is 18.1 KB and the "incorrect" PDF file is 27.8 KB. Up to that point I would not have thought there was a big problem. But the difference between the sizes of the PS files is astounding: 62.3 KB vs 3.74 MB!

I agree that the PNG file exported by Aspose.Words in this case is different from the original. But all aspects of the original and produced file are the same: they are both 300dpi, 32bit color, 770x49 pixels and they match pixel by pixel.

The produced PNG files are different from the originals because Aspose.Words loads them into a .NET Image object during mail merge and then uses Image.Save to save the image into a stream inside the DOC file. So the .NET PNG decoder and encoder perform this conversion, and this produces a file that is different from the original.

What I'm saying is that if the fault is inside the PNG files produced by Aspose.Words, then the fault is really in the .NET Framework PNG encoder. This is of course an issue for Microsoft to deal with.

What I'm also saying is that the PNG files produced by .NET (and by Aspose.Words) might be "correct"; the encoder might just be using different compression parameters or something similar, and in this case it produces a file that is 3 KB in size instead of 1 KB. The actual fault might be further down the line, most likely in the tool that you use to convert from PDF to PS, which may simply not handle that type of compression well. It is unlikely that the fault is in Aspose.Pdf, but it is possible: maybe it stores these PNG files in some way that your Ghostscript does not properly understand.
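
One way to see how the two files actually differ is to read the PNG header and chunk list of each and compare them. The following is only a rough sketch in Python (not something Aspose.Words does); the function name `png_summary` is made up for illustration. It reports the IHDR fields (size, bit depth, color type, interlacing) and the total size of the compressed IDAT data, which should show whether the re-encoded file uses a different color type or just compresses less tightly. It does not verify chunk CRCs or decode pixels.

```python
import struct
import sys

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Color type codes defined by the PNG specification (IHDR chunk).
COLOR_TYPES = {
    0: "grayscale",
    2: "truecolor (RGB)",
    3: "indexed (palette)",
    4: "grayscale + alpha",
    6: "truecolor + alpha (RGBA)",
}


def png_summary(data: bytes) -> dict:
    """Summarize the IHDR fields and chunk layout of a PNG byte string.

    Hypothetical helper for illustration: CRCs are not checked and
    the image data is not decompressed.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")

    info = {"chunks": [], "idat_bytes": 0}
    pos = 8
    # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        name = ctype.decode("latin-1")
        info["chunks"].append(name)

        if name == "IHDR":
            width, height, depth, color, _comp, _filt, interlace = struct.unpack(
                ">IIBBBBB", body
            )
            info.update(
                width=width,
                height=height,
                bit_depth=depth,
                color_type=COLOR_TYPES.get(color, f"unknown ({color})"),
                interlaced=bool(interlace),
            )
        elif name == "IDAT":
            info["idat_bytes"] += length
        elif name == "IEND":
            break

        pos += 12 + length  # length + type + body + CRC
    return info


if __name__ == "__main__":
    # Usage: python png_summary.py encode1.png Aspose.Words.001.png
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, png_summary(f.read()))
```

Running it on an original file and on the matching temporary copy should make a change of color type or bit depth visible at a glance; if those fields are identical, the size difference would come from compression settings or extra ancillary chunks.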

So far it looks like there is no easy way for us to fix this from our side. We don't really produce PNG files; we ask the .NET Framework to do it. If it does so in a way that Ghostscript does not understand, we are somewhat stuck. One option is for us to start using a new image format library (not likely at this stage). Another option is for me to try storing the PNG file directly inside the DOC file during mail merge without going through the Image object. I cannot guarantee it will work, but I will try, and if it works, it will come out in the next release at the end of August.

Hi Roman,

Thank you for your comment. I have dug deeper into this issue and looked at the inner workings of the Image object. It seems that this object loads and saves the image using P/Invoke calls to the GDI+ API, which means it uses the Windows PNG codec to load and save the image. This codec seems to alter the information (I think it converts a grayscale palette into a color palette, and maybe does some other nasty things as well).
Interestingly, I see that if I put the image directly into the template, the temporary image is then the original image, so you use the Image object only for mail merge? Could you store the mail-merged image the same way as if it were embedded in the template?
Anyway, for now I have found a way to force the Image object to save the original raw data. It is a dirty hack (using reflection), but it works for me right now.
I have also posted to the Ghostscript team, but I do not know if they are willing to invest time in this issue… we will see.

Thank you

Robert

Your observation is correct. When just opening and saving a document, we don't force the images through .NET Image objects. But during mail merge we do, and I was going to review whether this is absolutely necessary.

Fixed in Aspose.Words 4.0. Mail merge now does not require the Image object.