Mail to PDF - inline images shown as Red Cross

Hello.

We are in the process of procuring Aspose.Total licenses. While prototyping code to convert MSG files to PDF, we came across this article:
https://docs.aspose.com/email/java/saving-a-msg-as-pdf/

Following this article, we wrote code to do the conversion in a similar manner. The code behaves differently on Windows (standalone prototype) and on Linux (prototype in a test environment on the cloud).
On Linux, the inline images in the mail are not captured and are shown as a red cross.
The aspose-email version used is 21.12.

What is also intriguing is that the article says regular attachments are not part of the conversion:

An email message can contain attachments as well. Since each attachment can be of different media type, Aspose.Email ignores these attachments while converting to MHTML i.e. only inline images in a message will be part of MHTML and any regular attachments will be ignored.

However, we have images among the attachments, and one of them was captured and is shown correctly in the converted file. The second image attachment is not shown (perhaps due to a restriction in the evaluation version; could you confirm?).

Here are the sample snapshots of the converted PDF and the actual mail file used for the tests.

Test.zip (489.5 KB)

@vedjaipraful

Please get a 30-day temporary license and apply it before converting MSG to PDF. Please also use the latest version of Aspose.Email for Java, 22.3. If you still face the problem, please ZIP and attach the MHTML file generated from the MSG using Aspose.Email here for testing. We will investigate the issue and provide you with more information on it.

So you think this is related to the evaluation version, and that a licensed version would solve the issue?

What about attachments that are images: are they supposed to be part of the output file? The documentation link in my post above says that attachments will not be part of the output saved from the MSG file.

Can you please help clarify these points?

@vedjaipraful

Please note that in evaluation mode there are some limitations applied, e.g. the document’s contents are truncated after a certain number of paragraphs. So, please get a temporary license, apply it, convert the MSG to MHTML and share that MHTML here for further testing. Thanks for your cooperation.

@tahir.manzoor
Thanks a lot for this quick update.
I represent an organization, and we are already in discussion with Aspose about procuring Aspose.Total for Java licenses.
So going through a separate process to get temporary licenses and then putting them to use could create some issues.
If you think this is due to the evaluation version being used, then I could wait for the actual licenses.

However, please note that when I run a simple program on Windows, it gives me good output with the correct inline image. But when I use the same code in a test environment on Linux, I get the red cross image issue.
If the evaluation version were responsible for this, it would not explain the difference in behavior between Windows and Linux.

Thoughts?

@vedjaipraful

Please note that MSG to PDF conversion uses Aspose.Email and Aspose.Words. Aspose.Email converts MSG to MHTML and Aspose.Words converts MHTML to PDF.

We need the MHTML output that is generated by Aspose.Email for testing. You can generate it without a license file. Please save the MSG to MHTML and send it here for testing. Thanks for your cooperation.

Hello @tahir.manzoor

Please find the requested MHTML file attached here.
SampleFiles.zip (532.0 KB)

@tahir.manzoor
Some additional information on the files attached above: the MHTML shows the correct image, without the red cross. It seems the problem is in the conversion to PDF.
FYI, we have a valid license for Aspose.Words. The Aspose.Words version in use is 20.12.

@vedjaipraful

Your issue is related to Aspose.Words. So, we have moved this forum thread to Aspose.Words’ forum where you will be guided appropriately.

@vedjaipraful I have checked the conversion of your document to PDF using the latest 22.4 version of Aspose.Words for Java and cannot reproduce the problem. So, please try using the latest version on your side and let me know if the problem still persists.

I would like to point out that when I use Aspose.Words 20.12 in a standalone program on Windows, everything works fine. But the same version on Linux does not work. So I do not believe the issue is related to using the latest version of the Aspose.Words JAR.

@vedjaipraful Thank you for additional information. I have tested the scenario in Linux Docker and image is rendered properly.

Document doc = new Document("/temp/in.mhtml");
doc.save("/temp/out.pdf");

The JARs I am using are deployed in a docker container. We do not use the docker container for ASPOSE.

This is my code -

// Load the source mail file with Aspose.Email
FileInputStream fstream = new FileInputStream(sourceFilePath);
MailMessage eml = MailMessage.load(fstream);

// Save the message to an in-memory stream in MHTML format
ByteArrayOutputStream emlStream = new ByteArrayOutputStream();
eml.save(emlStream, SaveOptions.getDefaultMhtml());

// Also save the same MHTML to disk so it can be inspected
String mhtmlFilePath = outputDir + fileNameWithoutExt + ".mhtml";
eml.save(mhtmlFilePath, SaveOptions.getDefaultMhtml());

// Load the MHTML stream into an Aspose.Words document
com.aspose.words.LoadOptions lo = ODCAsposeConfig.getWordsLoadOptions();
lo.setLoadFormat(LoadFormat.MHTML);
doc = new Document(new ByteArrayInputStream(emlStream.toByteArray()), lo);

doc.setFontSettings(ODCAsposeConfig.getWordsFontSettings());

// Collect warnings raised while the document is processed
warningCallbackWords = new WarningCallbackWords();
doc.setWarningCallback(warningCallbackWords);

// Save as PDF when that is the requested target type
if ("pdf".equalsIgnoreCase(targetType))
{
    targetFilePath = outputDir + fileNameWithoutExt + ".pdf";
    doc.save(targetFilePath);
    LOGGER.debug("DOC Conversion - preview save done");
}

@vedjaipraful I have used Docker to emulate a Linux environment. Have you tried using the latest 22.4 version of Aspose.Words for Java in your test environment? I also tested the conversion on Mac and the image is still there.

@alexey.noskov
Can you try the same test using Aspose.Words 20.12 and confirm that the image is visible? I believe that the issue is not related to the version. As I already mentioned, when the same test is run in a standalone program on Windows, we see the image correctly. I only get the red cross issue when I run this directly in the Linux environment.

@vedjaipraful I tested with 20.12 version and still cannot reproduce the issue on my side. Tested on Windows, Linux and Mac - the image is rendered fine.
The problem might occur because the JAI package is not available in your Linux environment:
https://docs.aspose.com/words/java/system-requirements/#optional-dependencies
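
To check this on the Linux host, a small diagnostic can report whether the JAI classes are visible to the JVM. This is only a sketch and does not use any Aspose API. The class name ImagingEnvCheck is hypothetical, and it assumes that being able to load javax.media.jai.JAI is a reasonable signal that the JAI package is on the classpath.

```java
import java.util.Arrays;
import java.util.TreeSet;
import javax.imageio.ImageIO;

/** Diagnostic sketch: reports which optional imaging classes this JVM can see. */
final class ImagingEnvCheck {

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, ImagingEnvCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: javax.media.jai.JAI is the entry class of the JAI package.
        System.out.println("JAI present: " + isClassPresent("javax.media.jai.JAI"));

        // Image formats that the JVM's built-in ImageIO can decode on this machine.
        System.out.println("ImageIO readers: "
                + new TreeSet<>(Arrays.asList(ImageIO.getReaderFormatNames())));
    }
}
```

Running it with the same classpath as the application on both Windows and Linux, and comparing the two outputs, should show whether the environments differ in this respect.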

@alexey.noskov
Thanks for this answer.
I tried a couple more things before trying the JAI package.

I tried saving the MHTML to DOC and DOCX format. The output files themselves do not contain the image. It is shown as a red cross image with the message “The linked image cannot be displayed. The file may have been moved, renamed, or deleted. Verify that the link points to the correct file and location.”
image.png (953 Bytes)
zipped1004757_161211731.zip (557.3 KB)

One more point that I recalled when I saw the DOC/DOCX file with that message for the image: we have implemented “IResourceLoadingCallback” and are skipping the loading of external reference URLs. We want to skip loading external reference images in order to avoid any security vulnerability.
Could this be the cause?

@alexey.noskov
@tahir.manzoor
It looks like our implementation of “IResourceLoadingCallback” is responsible for the “red cross” image. As mentioned earlier, we want to avoid loading images that come from external resources / URLs. This way we avoid the security vulnerabilities that were discovered during penetration tests.
We cannot simply remove this callback, because we set it as a common callback when initializing com.aspose.words.LoadOptions, and the same options are used when loading the Document object.

When I read the MHTML, I could not find any external reference for the inline image. All I could see is that the images are in JPG format (and not TIFF, which is why I ran this test before trying the JAI package).

This is a little strange, so I would need your help to understand how to handle this part when converting the MHTML to DOC and then to PDF.
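
One way to see how the inline images are referenced is to scan the MHTML text directly. The sketch below uses only the JDK; the class name MhtmlImageRefs is hypothetical. It assumes that images are referenced as src="cid:..." (or src=3D"cid:..." in quoted-printable parts) and that the image parts are declared with a Content-ID header. A real file may use Content-Location instead, or wrap long lines, in which case the patterns would need adjusting.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: lists cid: image references in an MHTML text and the parts that declare them. */
final class MhtmlImageRefs {

    // Matches src="cid:..." and the quoted-printable form src=3D"cid:..."
    private static final Pattern CID_SRC = Pattern.compile(
            "src=(?:3D)?[\"']cid:([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Matches a MIME part header such as: Content-ID: <image001.jpg>
    private static final Pattern CONTENT_ID = Pattern.compile(
            "^Content-ID:\\s*<([^>]+)>", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    private static Set<String> collect(Pattern pattern, String text) {
        Set<String> found = new LinkedHashSet<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    /** Content IDs used by image tags in the HTML body. */
    static Set<String> referencedCids(String mhtml) {
        return collect(CID_SRC, mhtml);
    }

    /** Content IDs declared by MIME parts in the same file. */
    static Set<String> declaredContentIds(String mhtml) {
        return collect(CONTENT_ID, mhtml);
    }

    /** References that no part in the file declares. */
    static Set<String> unresolved(String mhtml) {
        Set<String> missing = new LinkedHashSet<>(referencedCids(mhtml));
        missing.removeAll(declaredContentIds(mhtml));
        return missing;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java MhtmlImageRefs <file.mhtml>");
            return;
        }
        // ISO-8859-1 maps every byte to a character, so arbitrary content cannot fail decoding.
        String mhtml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.ISO_8859_1);
        System.out.println("referenced: " + referencedCids(mhtml));
        System.out.println("declared:   " + declaredContentIds(mhtml));
        System.out.println("unresolved: " + unresolved(mhtml));
    }
}
```

If every referenced ID is also declared, the images are embedded in the file itself rather than fetched from an external location.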

@vedjaipraful Thank you for the additional information. It looks like this problem is already resolved in the most recent 22.5 version (for .NET for now, Java will be available in a week or so). Now Aspose.Words does not invoke IResourceLoadingCallback for data URLs, like in your case. Please see the release notes for more information.
In your case, you can modify your IResourceLoadingCallback implementation to check whether the URL the resource is loaded from is external. For example, in MHTML produced by Aspose.Email, image URLs start with "cid:image". So you can check for this prefix and return the default ResourceLoadingAction:

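private static class MyResourceLoadingCallback implements IResourceLoadingCallback
{
    @Override
    public int resourceLoading(ResourceLoadingArgs args) throws Exception {
        // Images embedded in the MHTML by Aspose.Email use "cid:image..." URIs: load them as usual.
        if (args.getOriginalUri().startsWith("cid:image"))
            return ResourceLoadingAction.DEFAULT;

        // Everything else is treated as external and skipped.
        return ResourceLoadingAction.SKIP;
    }
}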
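
If the prefix "cid:image" turns out to be too narrow, the decision can be moved into a small helper that is easy to test on its own. The following is a sketch, not part of Aspose.Words: the class name ResourceUriPolicy is hypothetical, and treating every cid: or data: URI as embedded is an assumption that should be reviewed against your security requirements.

```java
import java.util.Locale;

/** Sketch: decides whether a resource URI points inside the document itself. */
final class ResourceUriPolicy {

    /**
     * Returns true for cid: references (MIME parts of the same message) and
     * data: URLs (content inlined in the HTML). Anything else, such as http,
     * https, file or relative paths, is treated as external.
     */
    static boolean isEmbedded(String uri) {
        if (uri == null) {
            return false;
        }
        String normalized = uri.trim().toLowerCase(Locale.ROOT);
        return normalized.startsWith("cid:") || normalized.startsWith("data:");
    }

    public static void main(String[] args) {
        System.out.println(isEmbedded("cid:image001.jpg@01D84A2B"));
        System.out.println(isEmbedded("https://example.com/tracker.png"));
    }
}
```

Inside resourceLoading, the callback would then return ResourceLoadingAction.DEFAULT when ResourceUriPolicy.isEmbedded(args.getOriginalUri()) is true and ResourceLoadingAction.SKIP otherwise.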

@alexey.noskov
Thanks for this suggestion. It works with this check, which should solve my immediate problem.
Please also let me know when the new version with the fix for data URLs is available for Aspose.Words for Java.

@tahir.manzoor - could you please help with this part?
I also had another question, about the attachments in the mail. As part of the original query I asked whether images attached to the mail are supposed to be converted and be part of the output saved as PDF, because the documentation page says that attachments will not be part of the PDF. Based on the statement below, I believed that even attached images should not be part of the output PDF, but that is not what happens.

An email message can contain attachments as well. Since each attachment can be of different media type, Aspose.Email ignores these attachments while converting to MHTML i.e. only inline images in a message will be part of MHTML and any regular attachments will be ignored.