Email with Attachments conversion to PDF

Hi. I have an evaluation licence for Aspose.Total. I have the following requirement:

Convert an Email with Attachments to PDF; the attachments can be a number of different formats, but in my current scenario, it is a PDF document.

I have tried the following approach:

  1. Convert the Email to bytes
var input = new java.io.ByteArrayInputStream(bytes);
  2. Create a Msg object
var msg = new com.aspose.email.MailMessage();
msg = msg.load(input);
  3. Create PDF from Msg
var output = new java.io.ByteArrayOutputStream();
msg.save("HTMLOutput.html", com.aspose.email.SaveOptions.getDefaultMhtml());
var pdfDoc = new com.aspose.words.Document("HTMLOutput.html");
  4. Create a MapiEmail message
var outlookMessageFile = new com.aspose.email.MapiMessage();
input = new java.io.ByteArrayInputStream(bytes);
outlookMessageFile = outlookMessageFile.fromMailMessage(input);
  5. Get the attachments and save to Disc
attachments = outlookMessageFile.getAttachments();

for (var i = 0; i < outlookMessageFile.getAttachments().size(); i++)
{
    var outlookMessageAttachment = outlookMessageFile.getAttachments().get_Item(i);
    outlookMessageAttachment.save(FILE_PATH_OUT + outlookMessageAttachment.getDisplayName());
}
  6. Save the PDF Document
var path = FILE_PATH_OUT + 'test.pdf';
pdfDoc.save(path);

However, when I try to add the PDF attachment to the PDF Document, I get an error:

Here is the code:

var fileSpecification = new com.aspose.pdf.FileSpecification(FILE_PATH_OUT + outlookMessageAttachment.getDisplayName());
pdfDoc.getEmbeddedFiles().add(fileSpecification);

Here is the error:

TypeError: Cannot find function getEmbeddedFiles in object Document.

I can’t see a getEmbeddedFiles method in com.aspose.words.Document, but I can see one in com.aspose.pdf.Document

But when I modify the package name to use .pdf instead of .words, I get a different error:

JavaException: com.aspose.pdf.exceptions.InvalidPdfFileFormatException: Incorrect file header

I think this error relates to the following line of code:

new com.aspose.pdf.Document("HTMLOutput.html");

Can you please help?

@swy In your scenario pdfDoc is a com.aspose.words.Document object, which does not have a getEmbeddedFiles() method. In your case you can put the attachments into the document as embedded OLE objects using Aspose.Words. Please see our documentation for more information:
https://docs.aspose.com/words/java/working-with-ole-objects/
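
A note on the second error ("Incorrect file header"): it is consistent with HTMLOutput.html not being a PDF at all, since that file holds the MHTML saved from the email, whereas a PDF file starts with the bytes %PDF-. The sketch below uses only the JDK to check a file for that header before handing it to a PDF loader. The class and method names (PdfHeaderCheck, startsWithPdfHeader, isPdfFile) are hypothetical, and it is an assumption that Aspose.PDF's own validation behaves similarly.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of any Aspose library.
final class PdfHeaderCheck {
    // Every PDF file begins with "%PDF-" followed by the version, e.g. "%PDF-1.7".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfHeaderCheck() {}

    // Returns true if the data starts with the PDF header bytes.
    static boolean startsWithPdfHeader(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, PDF_MAGIC.length), PDF_MAGIC);
    }

    // Reads only the first few bytes of the file and checks them.
    static boolean isPdfFile(Path file) throws IOException {
        byte[] head = new byte[PDF_MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int read = in.readNBytes(head, 0, head.length);
            return startsWithPdfHeader(Arrays.copyOf(head, read));
        }
    }
}
```

With a check like this, PdfHeaderCheck.isPdfFile(Paths.get("HTMLOutput.html")) would return false for the MHTML file, which tells you it needs converting (for example via Aspose.Words) before a PDF API can open it.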

Thanks for the info. I have had a read of the sample code and consulted the APIs. I have added the DocumentBuilder and the following line of code:

documentBuilder.insertOleObject(fileName, true, true, null);

I now see a PDF icon in the output PDF file, but double-clicking it does not open the attachment. How can I get this to work with the evaluation licence?

@swy You should set the PdfSaveOptions.EmbedAttachments property (setEmbedAttachments(true) in Java) to preserve the OLE objects as attachments in the output PDF.

Hi. I have tried this setting, but double-clicking the PDF icon still does nothing.

I have attached the input and output.
MSG Attachment 2.zip (2.7 MB)

Here is my code:

function convertMSGAttachmentsToPDF() {
   
   var bytes = qie.readFile(channelCache.getValue('FILE_PATH_IN') + channelCache.getValue('FILE_NAME'));
   var encodedBytes = qie.base64EncodeBytes(bytes);
   
   var attachments;

   try{
      var license = new com.aspose.words.License();
      license.setLicense("Aspose.TotalProductFamily.lic");
      
      var input = new java.io.ByteArrayInputStream(bytes);
      var msg = new com.aspose.email.MailMessage();
      msg = msg.load(input);
      
      var output = new java.io.ByteArrayOutputStream();

      msg.save("HTMLOutput.html", com.aspose.email.SaveOptions.getDefaultMhtml());
      
      var pdfDoc = new com.aspose.words.Document("HTMLOutput.html");
      var documentBuilder = new com.aspose.words.DocumentBuilder(pdfDoc);
      
      var outlookMessageFile = new com.aspose.email.MapiMessage();
      input = new java.io.ByteArrayInputStream(bytes);
      outlookMessageFile = outlookMessageFile.fromMailMessage(input);


      attachments = outlookMessageFile.getAttachments();
      
      for (var i = 0; i < outlookMessageFile.getAttachments().size(); i++) {
      	
      	var outlookMessageAttachment =  outlookMessageFile.getAttachments().get_Item(i);
      	outlookMessageAttachment.save(channelCache.getValue('FILE_PATH_OUT') + outlookMessageAttachment.getDisplayName());
      	
      	var fileName = new java.lang.String(channelCache.getValue('FILE_PATH_OUT') + outlookMessageAttachment.getDisplayName());
      	var fileNamePDFIcon = new java.lang.String(channelCache.getValue('FILE_PATH_OUT') + 'pdficon.png');
      	
      	try{
      	   documentBuilder.insertOleObject(fileName, true, true, null);
      	}catch(Exception)
      	{
      	   qie.info("Insert OLE Error 1: " + Exception);
      	}
      	      	
      }
      
      
      var path = FILE_PATH_OUT + 'test.pdf';
      
      var PdfSaveOptions = new com.aspose.words.PdfSaveOptions();
      PdfSaveOptions.setSaveFormat(com.aspose.words.SaveFormat.PDF);
      PdfSaveOptions.setEmbedAttachments(true);
      
      pdfDoc.save(path, PdfSaveOptions);
      
   }catch(Exception)
   {
      qie.info("Error: " + Exception);
   }
}

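@swy I have simplified your code a bit and now it works as expected on my side. Note that the second argument of insertOleObject (isLinked) is false here, so the attachment is embedded rather than linked; your code passed true: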

// Load the MSG file (MailMessage.load is a static method).
com.aspose.email.MailMessage msg = com.aspose.email.MailMessage.load("C:\\Temp\\in.msg");
// Save the message body as MHTML and open it with Aspose.Words.
msg.save("C:\\Temp\\tmp.mhtml", com.aspose.email.SaveOptions.getDefaultMhtml());
com.aspose.words.Document doc = new com.aspose.words.Document("C:\\Temp\\tmp.mhtml");
com.aspose.words.DocumentBuilder builder = new com.aspose.words.DocumentBuilder(doc);

// Save each attachment to disk, then insert it embedded (isLinked = false) and shown as an icon (asIcon = true).
for (int i = 0; i < msg.getAttachments().size(); i++)
{
    com.aspose.email.Attachment attachment = msg.getAttachments().get_Item(i);
    String attachmentFileName = "C:\\Temp\\" + attachment.getName();
    attachment.save(attachmentFileName);
    builder.insertOleObject(attachmentFileName, false, true, null);
}

// EmbedAttachments keeps the OLE objects as attachments in the output PDF.
com.aspose.words.PdfSaveOptions opt = new com.aspose.words.PdfSaveOptions();
opt.setEmbedAttachments(true);
doc.save("C:\\Temp\\out.pdf", opt);

out.pdf (2.7 MB)

Hi. This is now working - thank you so much for your help.

One final question for you: I have created another input email that contains 6 attachments; the output only seems to contain the first 3 attachments as embedded documents.

If I remove the first 3 attachments from the input email, the remaining 3 attachments do appear in the output pdf as embedded documents.

Is there a limit on the number of attachments that can be processed / embedded?

@swy Could you please attach the problematic MSG file here for testing? We will check the issue and provide you with more information.

Hi. I have attached a zip containing the file.
MSG Attachment 5.zip (5.6 MB)

@swy Thank you for the additional information. As far as I can see, msg.getAttachments().size() returns 3, so my colleagues from the Aspose.Email team will need to take a look at the issue. They will get back to you shortly.

Hello @swy,

Aspose.Email has limitations on the number of attachments for the evaluation version. If you apply a license, there will be no such restrictions.

I understand, thanks for the info.

Do you have any sample code that shows how to insert the attachments as pages into the PDF document, rather than embedding as attachments?

Would they have to be converted into images first?

@swy It depends on the attachment format. If the attachment is an MS Word document and you need to add its content to the end of the generated PDF, you can simply append it to the document:
https://docs.aspose.com/words/java/insert-and-append-documents/

Hi, I have been testing various Aspose libraries using the Evaluation licence.

I am now able to successfully convert the following file types to PDF:

  1. XLS, XLSX, XLSM, CSV (using Aspose.Cells)
  2. PPT, PPTX, PPTM (using Aspose.Slides)
  3. MSG (using Aspose.Email)

My next task is to Convert an Email Message with Attachments of the above types into a PDF Document using Aspose.Words.

I am successfully able to embed attachments into a PDF using insertOleObject(). However, the project team tells me that they can’t be embedded and have to be converted into PDF pages and inserted into a single PDF with the other attachments.

I have therefore attempted to use the insertDocument() method instead. However, I am facing some issues:

  1. Publisher File

I can’t get Aspose.Words to insert a Publisher file as document page(s) into a PDF Document; it just seems to insert a large amount of junk characters.

  2. Excel Files

I can’t get Aspose.Words to insert an Excel file as document page(s) into a PDF Document; I get the error:

Can't find method com.aspose.words.DocumentBuilder.insertDocument(com.aspose.cells.Workbook,number).

  3. PowerPoint Files

I can’t get Aspose.Words to insert a PowerPoint file as document page(s) into a PDF Document; I get the error:

Can't find method com.aspose.words.DocumentBuilder.insertDocument(com.aspose.slides.Presentation,number)

  4. PDF Files

Aspose.Words doesn’t seem to support inserting a PDF Document as document page(s) into another PDF Document; I get the error:

com.aspose.words.UnsupportedFileFormatException: Pdf format is not supported on this platform. Use .NET Standard or .NET 4.6.1 version of Aspose.Words for loading Pdf documents.

I have attached a zip file containing my code and the input message with 3 attachments

functionConvertMessage.zip (2.1 MB)

@swy com.aspose.words.DocumentBuilder.insertDocument accepts only com.aspose.words.Document instances as input. Please see our documentation to learn which document formats are supported by Aspose.Words:
https://docs.aspose.com/words/java/supported-document-formats/

So I am afraid there is no direct method to insert the content of Publisher, Excel, PowerPoint and PDF documents using Aspose.Words. You should first convert these documents to a supported format, e.g. to an MS Word document.
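
Since each attachment type has to go through a different product before it can be merged, one way to organise the code is to pick the converter from the file extension first. The sketch below uses only the JDK and calls no Aspose API. The extension-to-product mapping simply mirrors the formats mentioned in this thread, and the names AttachmentRouter, Converter and route are hypothetical. Anything not listed, including Publisher files, is reported as unsupported rather than guessed at.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: decides which product should convert an attachment.
final class AttachmentRouter {
    enum Converter { ASPOSE_WORDS, ASPOSE_CELLS, ASPOSE_SLIDES, ASPOSE_EMAIL, ALREADY_PDF, UNSUPPORTED }

    // Mapping taken from the formats discussed in this thread (an assumption, not an official list).
    private static final Map<String, Converter> BY_EXTENSION = Map.ofEntries(
        Map.entry("doc", Converter.ASPOSE_WORDS),
        Map.entry("docx", Converter.ASPOSE_WORDS),
        Map.entry("xls", Converter.ASPOSE_CELLS),
        Map.entry("xlsx", Converter.ASPOSE_CELLS),
        Map.entry("xlsm", Converter.ASPOSE_CELLS),
        Map.entry("csv", Converter.ASPOSE_CELLS),
        Map.entry("ppt", Converter.ASPOSE_SLIDES),
        Map.entry("pptx", Converter.ASPOSE_SLIDES),
        Map.entry("pptm", Converter.ASPOSE_SLIDES),
        Map.entry("msg", Converter.ASPOSE_EMAIL),
        Map.entry("pdf", Converter.ALREADY_PDF)
    );

    private AttachmentRouter() {}

    // Looks up the converter by the case-insensitive extension of the file name.
    static Converter route(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Converter.UNSUPPORTED; // no extension to go on
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(extension, Converter.UNSUPPORTED);
    }
}
```

In the attachment loop you could then switch on AttachmentRouter.route(attachment.getName()) and call the matching product for each case, logging the UNSUPPORTED ones instead of passing them to insertDocument.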

@swy,
Please note that with Aspose.Slides for Java, you can convert PowerPoint presentations to PDF documents like this:

var presentation = new com.aspose.slides.Presentation(inputStream);
try {
    presentation.save("output.pdf", com.aspose.slides.SaveFormat.Pdf);
} finally {
    presentation.dispose(); // release resources even if save() throws
}

@swy,

By the way, Aspose.Cells now supports rendering Excel spreadsheets to DOCX directly, so you could render the files (e.g., XLS, XLSX, XLSM, CSV) to DOCX format via Aspose.Cells for Java and then use Aspose.Words to insert them as a Document.