Creating PDF files from Word documents that contain embedded Word/Excel files

We have a directory of Word documents that contain embedded Word documents and Excel files. We are trying to render each of them to a PDF file, with the embedded Word/Excel files converted to PDF as well.

Is this something that Aspose.Words can accomplish?

Thank you,

Lisa

Hi Lisa,


Thanks for your inquiry. Yes, Aspose.Words supports preserving OLE objects in documents. That is, if you open an MS Word document and then save it (possibly in another MS Word format or as PDF), the OLE objects are preserved. You can also access the objects programmatically and extract their data and preview images.

If we can help you with anything else, please feel free to ask.

Best Regards,

So to confirm, when your product saves the word doc as a pdf - will the embedded document be a pdf or a document still? Thank you. Lisa

Hi Lisa,


Thanks for your inquiry. Please note that documents embedded inside a main Word document are not themselves converted to PDF when the main document is converted to PDF. However, the contents of the main document, including the contents of the embedded objects, will be rendered to PDF.

Please let me know if I can be of any further assistance.

Best Regards,

So when you said Aspose.Words supports OLE objects in documents, what did you mean by this? You said OLE objects are preserved.

We are trying to create a PDF file from a Word document that has a Word document within it. Are you confirming that the PDF would not preserve the embedded file so that it could be launched from the new PDF?

I believe you are saying this isn't possible but wanted to confirm. Thank you. Lisa

Hi Lisa,


Thanks for your inquiry. First of all, please see the attached document. There are three embedded objects inside the attached document: a simple embedded document, a document embedded as a link, and a document embedded as an icon (show icon). You can try using the following code to extract each of these embedded objects and convert them to PDF as well:

using System.IO;
using Aspose.Words;
using Aspose.Words.Drawing;

Document doc = new Document(@"c:\temp\test.docx");

// Get all shapes in the document, including nested ones
NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true);

int i = 0;
// Loop through all shapes; only shapes with an OleFormat hold OLE objects
foreach (Shape shape in shapes)
{
    if (shape.OleFormat != null)
    {
        if (!shape.OleFormat.IsLink)
        {
            // Extract an embedded Word document and convert it to PDF
            if (shape.OleFormat.ProgId == "Word.Document.12")
            {
                using (MemoryStream stream = new MemoryStream())
                {
                    shape.OleFormat.Save(stream);
                    // Rewind so the document is read from the start of the stream
                    stream.Position = 0;

                    Document newDoc = new Document(stream);
                    newDoc.Save(string.Format(@"C:\temp\outEmbeded_{0}.pdf", i));
                }

                i++;
            }
            // Extract an embedded Excel workbook
            else if (shape.OleFormat.ProgId == "Excel.Sheet.12")
            {
                // Here you can use the Aspose.Cells component
                // to convert MS Excel files to PDF
            }
        }
        else
        {
            // Linked object: open the file the link points to
            string filePath = shape.OleFormat.SourceFullName;

            Document newDoc = new Document(filePath);
            newDoc.Save(string.Format(@"C:\temp\outLinkedEmbeded_{0}.pdf", i));

            i++;
        }
    }
}

Secondly, when you convert the main document, in this case ‘test.docx’, directly to PDF format, the content of embedded objects inside will be rendered in PDF. There will be no attachments or links in the final PDF.
Please let me know if I can be of any further assistance.

Best Regards,

Hi Awais,

Could you provide similar code for Java?

Thanks,

Pramod Talekar

Hi Pramod,


Thanks for your inquiry. Sure, please find the code here. Please let me know if I can be of any further assistance.
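
For reference, the per-shape decision that the C# sample earlier in this thread makes can be sketched in plain Java with no Aspose dependency. This is only an illustration under stated assumptions: the class `OleRouting`, the enum `Action`, and the methods `route` and `outputName` are hypothetical names invented for this sketch, not part of any Aspose API. In a real port the two inputs would presumably come from the shape's OLE format (its ProgId and its link flag), and the actual conversion would still be done by Aspose.Words or Aspose.Cells.

```java
import java.util.Locale;

public class OleRouting {

    /** Hypothetical labels for what to do with one OLE object. */
    public enum Action {
        CONVERT_EMBEDDED_WORD,   // save the OLE data to a stream, load it as a document, save as PDF
        CONVERT_EMBEDDED_EXCEL,  // hand the OLE data to a spreadsheet converter such as Aspose.Cells
        CONVERT_LINKED_FILE,     // open the file the link points to and save it as PDF
        SKIP                     // any other kind of OLE object
    }

    /** Mirrors the branching of the C# sample: links first, then ProgId. */
    public static Action route(String progId, boolean isLink) {
        if (isLink) {
            return Action.CONVERT_LINKED_FILE;
        }
        if (progId == null) {
            return Action.SKIP;
        }
        switch (progId) {
            case "Word.Document.12":
                return Action.CONVERT_EMBEDDED_WORD;
            case "Excel.Sheet.12":
                return Action.CONVERT_EMBEDDED_EXCEL;
            default:
                return Action.SKIP;
        }
    }

    /** Builds the output file name using the same patterns as the C# sample. */
    public static String outputName(boolean isLink, int index) {
        String prefix = isLink ? "outLinkedEmbeded_" : "outEmbeded_";
        return String.format(Locale.ROOT, "%s%d.pdf", prefix, index);
    }

    public static void main(String[] args) {
        // Made-up inputs standing in for the OLE objects of a document.
        String[] progIds = {"Word.Document.12", "Excel.Sheet.12", "Word.Document.12", "Package"};
        boolean[] links = {false, false, true, false};

        int i = 0;
        for (int k = 0; k < progIds.length; k++) {
            Action action = route(progIds[k], links[k]);
            if (action == Action.CONVERT_EMBEDDED_WORD || action == Action.CONVERT_LINKED_FILE) {
                System.out.println(action + " -> " + outputName(links[k], i));
                i++; // the C# sample only advances the counter when it writes a PDF
            } else {
                System.out.println(action);
            }
        }
    }
}
```

Note that, like the C# sample, this treats every linked object the same way regardless of its ProgId, so a linked Excel workbook would need its own branch before it could be converted.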

Best Regards,