Word → PDF and also convert the embedded Excel files to PDF (Java)
The sample below shows how to:
- Convert the main Word document to PDF with Aspose.Words.
- Enumerate every embedded object (OLE/Package) in the Word file.
- Detect the objects that are Excel workbooks (.xls / .xlsx).
- Convert each Excel workbook to PDF with Aspose.Cells.
- (Optional) Render the generated Excel-PDF pages to images and insert them back into the Word document.
Prerequisites
- Aspose.Words for Java ≥ 24.11
- Aspose.Cells for Java ≥ 25.1
- (Optional) Aspose.PDF for Java ≥ 23.10, only needed if you want to embed the Excel-PDF as an image.
The Aspose.Words and Aspose.Cells JARs (plus Aspose.PDF, if you use the optional step) must be on the project classpath.
1. Complete code
// Explicit imports: both libraries define PdfSaveOptions, SaveFormat, Shape, etc.,
// so importing com.aspose.words.* and com.aspose.cells.* together is ambiguous.
import com.aspose.cells.Workbook;
import com.aspose.words.Document;
import com.aspose.words.Node;
import com.aspose.words.NodeType;
import com.aspose.words.OleFormat;
import com.aspose.words.PdfCompliance;
import com.aspose.words.PdfSaveOptions;
import com.aspose.words.Shape;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;

public class WordWithExcelAttachmentsToPdf {

    /** Convert a Word Document to PDF and return the PDF bytes. */
    private static byte[] convertWordToPdf(Document wordDoc) throws Exception {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
            PdfSaveOptions pdfOpts = new PdfSaveOptions();
            pdfOpts.setCompliance(PdfCompliance.PDF_17);
            pdfOpts.setUpdateFields(false); // keep fields as-is
            wordDoc.save(bos, pdfOpts);
            return bos.toByteArray();
        }
    }

    /** Simple helper: true for xls or xlsx, with or without a leading dot. */
    private static boolean isExcel(String ext) {
        if (ext == null) {
            return false;
        }
        String e = ext.startsWith(".") ? ext.substring(1) : ext;
        return e.equalsIgnoreCase("xls") || e.equalsIgnoreCase("xlsx");
    }

    /** Convert an Excel byte array to PDF and return the PDF bytes. */
    private static byte[] convertExcelToPdf(byte[] excelData) throws Exception {
        try (ByteArrayInputStream bis = new ByteArrayInputStream(excelData);
             ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
            // Aspose.Cells automatically detects the format (xls / xlsx)
            Workbook wb = new Workbook(bis);
            // Fully qualified: Aspose.Words has its own SaveFormat with different values
            wb.save(bos, com.aspose.cells.SaveFormat.PDF);
            return bos.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        // -------------------------------------------------
        // 1. Load the Word document (DOC or DOCX)
        // -------------------------------------------------
        Document wordDoc = new Document("input.docx");

        // -------------------------------------------------
        // 2. Convert the main document to PDF
        // -------------------------------------------------
        byte[] wordPdf = convertWordToPdf(wordDoc);
        try (FileOutputStream fos = new FileOutputStream("output_word.pdf")) {
            fos.write(wordPdf);
        }

        // -------------------------------------------------
        // 3. Process embedded objects
        // -------------------------------------------------
        // Aspose.Words exposes embedded files as shapes that carry an OleFormat.
        // toArray() takes a snapshot, so the document can be modified inside the loop.
        int excelIndex = 1; // used for naming the generated PDFs
        int otherIndex = 1; // used for naming all other attachments
        for (Node node : wordDoc.getChildNodes(NodeType.SHAPE, true).toArray()) {
            OleFormat ole = ((Shape) node).getOleFormat();
            if (ole == null || ole.isLink()) {
                continue; // not an OLE shape, or only a link with no embedded data
            }

            // Raw bytes of the embedded file
            ByteArrayOutputStream raw = new ByteArrayOutputStream();
            ole.save(raw);
            byte[] data = raw.toByteArray();

            // Determine the type from the suggested extension and the ProgId
            String ext = ole.getSuggestedExtension(); // e.g. ".xlsx"
            String progId = ole.getProgId();          // e.g. "Excel.Sheet.12"

            if (isExcel(ext) || (progId != null && progId.startsWith("Excel.Sheet"))) {
                // -------------------------------------------------
                // 4. Excel → PDF
                // -------------------------------------------------
                byte[] excelPdf = convertExcelToPdf(data);
                String pdfName = "attachment_excel_" + excelIndex + ".pdf";
                try (FileOutputStream fos = new FileOutputStream(pdfName)) {
                    fos.write(excelPdf);
                }
                System.out.println("Excel attachment converted to PDF: " + pdfName);

                // -------------------------------------------------
                // 5. (Optional) Insert the PDF as an image back into the Word file
                // -------------------------------------------------
                // Uncomment the block below if you need a visual representation
                /*
                com.aspose.pdf.Document pdfDoc = new com.aspose.pdf.Document(
                        new ByteArrayInputStream(excelPdf));
                com.aspose.pdf.devices.Resolution res = new com.aspose.pdf.devices.Resolution(300);
                com.aspose.pdf.devices.PngDevice pngDev = new com.aspose.pdf.devices.PngDevice(res);
                ByteArrayOutputStream imgOut = new ByteArrayOutputStream();
                pngDev.process(pdfDoc.getPages().get_Item(1), imgOut); // first page only
                com.aspose.words.DocumentBuilder builder = new com.aspose.words.DocumentBuilder(wordDoc);
                builder.moveToDocumentEnd();
                builder.insertImage(imgOut.toByteArray());
                */
                excelIndex++;
            } else {
                // Save any other embedded file unchanged
                String otherName = "attachment_" + otherIndex++ + (ext == null ? "" : ext);
                try (FileOutputStream fos = new FileOutputStream(otherName)) {
                    fos.write(data);
                }
                System.out.println("Non-Excel attachment saved: " + otherName);
            }
        }

        // -------------------------------------------------
        // (Optional) Save the modified Word document if you inserted images
        // -------------------------------------------------
        // wordDoc.save("output_word_with_images.docx");
    }
}
| Task | API | Notes |
| --- | --- | --- |
| Convert Excel → PDF | new Workbook(InputStream) + Workbook.save(OutputStream, SaveFormat.PDF) | Aspose.Cells automatically recognises the format. |
| Convert Word → PDF | Document.save(OutputStream, PdfSaveOptions) | PdfSaveOptions lets you set compliance, field updates, etc. |
| (Optional) Render PDF page to image | com.aspose.pdf.Document + PngDevice | Needed only if you want to embed a visual of the Excel-PDF. |
| Insert image into Word | DocumentBuilder.insertImage(byte[]) | Places the image at the current cursor position. |
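When neither the extension nor the ProgId of an embedded object is usable, the bytes themselves can serve as a fallback hint. The sketch below is not part of either Aspose library; the class and method names are hypothetical. It relies on two file-format facts: legacy .xls files are OLE2 compound files that start with a fixed 8-byte signature, and .xlsx files are ZIP archives that contain an xl/workbook.xml entry. The OLE2 signature also matches legacy Word and PowerPoint files, so treat that check as a hint rather than proof.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: guesses from raw bytes whether data may be an Excel workbook. */
final class ExcelSniffer {

    // First 8 bytes of every OLE2 compound file (.xls, but also legacy .doc and .ppt).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    /** True if the data starts with the OLE2 compound-file signature. */
    static boolean looksLikeOle2(byte[] data) {
        if (data == null || data.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if ((data[i] & 0xFF) != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** True if the data is a ZIP archive that contains an xl/workbook.xml entry. */
    static boolean looksLikeXlsx(byte[] data) {
        if (data == null || data.length < 4 || data[0] != 'P' || data[1] != 'K') {
            return false;
        }
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (e.getName().equalsIgnoreCase("xl/workbook.xml")) {
                    return true;
                }
            }
        } catch (IOException ex) {
            // Corrupt or non-ZIP data: treat it as "not a workbook".
        }
        return false;
    }
}
```

A caller could try the extension first and fall back to `ExcelSniffer.looksLikeXlsx(data) || ExcelSniffer.looksLikeOle2(data)` only when the extension is missing.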
2. Frequently asked questions & tips
| Issue | Solution |
| --- | --- |
| Embedded object appears as "Package" | The ProgId is then "Package" rather than an Excel ProgId. If OleFormat.getSuggestedExtension() does not give a usable extension, read the original file name from shape.getOleFormat().getOlePackage().getFileName() (e.g., Object1.xlsx; getOlePackage() may return null for non-package objects) and parse the extension from it. |
| Excel workbook is password-protected | Use the Aspose.Cells LoadOptions when creating the Workbook: com.aspose.cells.LoadOptions lo = new com.aspose.cells.LoadOptions(); lo.setPassword("pwd"); Workbook wb = new Workbook(new ByteArrayInputStream(data), lo); |
| Large number of attachments (performance) | Create a single PdfSaveOptions instance and reuse it. Reusing one ByteArrayOutputStream (calling reset() before each iteration) also reduces GC pressure. |
| Remove the original OLE object after conversion | Call shape.remove() on the Shape that hosts the OLE object, then save the document. This shrinks the final Word file if you no longer need the original Excel object. |
| Combine all generated Excel PDFs into one file | After the loop, load each PDF with Aspose.PDF, append its pages to a master Document (master.getPages().add(other.getPages())), and save the master PDF. |
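For the "Package" case, parsing the extension out of a file name needs no Aspose API at all. The following is a minimal sketch in plain Java; `AttachmentNames`, `extensionOf` and `isExcelName` are hypothetical names introduced here for illustration, and the name passed in is assumed to be whatever file name you obtained for the embedded object.

```java
import java.util.Locale;

/** Hypothetical helper: derives a file extension from an embedded object's file name. */
final class AttachmentNames {

    /** Returns the lower-case extension without the dot, or "" if the name has none. */
    static String extensionOf(String name) {
        if (name == null) {
            return "";
        }
        // Drop any directory part so a dot in a folder name is not mistaken for an extension.
        int slash = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        String base = name.substring(slash + 1);
        int dot = base.lastIndexOf('.');
        if (dot <= 0 || dot == base.length() - 1) {
            return ""; // no dot, a leading dot only, or a trailing dot
        }
        return base.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** True for the Excel extensions handled in the sample (.xls / .xlsx). */
    static boolean isExcelName(String name) {
        String ext = extensionOf(name);
        return ext.equals("xls") || ext.equals("xlsx");
    }
}
```

With this in place, `AttachmentNames.isExcelName("Object1.xlsx")` can stand in for the extension check whenever only a file name is available.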
3. High‑level workflow
Word (DOC/DOCX)
│
├─► Aspose.Words → Word PDF (output_word.pdf)
│
└─► Enumerate EmbeddedObjectCollection
├─ Excel? → Aspose.Cells → Excel PDF (attachment_excel_*.pdf)
└─ Other → Save original binary
Convert: new Workbook(stream) → workbook.save(pdf, SaveFormat.PDF)
Persist: Write the PDF bytes wherever you need them; optionally render to an image and embed back into the Word document.
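The persist step can be kept separate from the conversion logic. The sketch below is one possible shape for it, using only java.nio; the `AttachmentWriter` class is hypothetical and simply mirrors the attachment_excel_1.pdf, attachment_excel_2.pdf naming used in the sample.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for the persist step: writes generated bytes to numbered files. */
final class AttachmentWriter {
    private final Path outputDir;
    private int nextIndex = 1;

    /** Creates the output directory (and any missing parents) if it does not exist yet. */
    AttachmentWriter(Path outputDir) throws IOException {
        this.outputDir = Files.createDirectories(outputDir);
    }

    /** Writes data as prefix + running index + "." + extension and returns the written path. */
    Path write(String prefix, String extension, byte[] data) throws IOException {
        Path target = outputDir.resolve(prefix + nextIndex++ + "." + extension);
        return Files.write(target, data);
    }
}
```

In the sample's loop, the two FileOutputStream blocks could then become calls such as `writer.write("attachment_excel_", "pdf", excelPdf)`, which keeps all output in one directory of your choosing.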
With the code above you can reliably convert a Word file to PDF and turn every embedded Excel workbook into its own PDF, all using Aspose libraries. If you need further customisation (e.g., merging all PDFs, preserving hyperlinks, or batch processing a folder), feel free to ask!