Getting attached files inside a Word document

Is there a way to work with files that are attached (embedded) inside a Word document?

Best Regards!

Hi Anas,

Thanks for your inquiry. Could you please attach a sample Word document that contains an attachment so we can test with it? What do you want to do with the attachments, e.g. do you want to extract the content of the attached files? Please provide complete details of your use case. We will investigate the scenario on our end and provide you with more information.

Best regards,

Hello,

I have attached two sample files, one .doc and one .docx. I need to process both of them as follows: open the file and check whether it has a Word file attached inside it; if so, convert the inner file and append it to the end of the parent document.

Best Regards!

Hi Anas,

Thanks for your inquiry. Please use the following code to load the embedded document and append its content to the end of the main document:

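// Imports needed at the top of the enclosing file:
import com.aspose.words.*;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

Document doc = new Document(getMyDir() + "sample2.doc");

// Collect all shapes, including those nested inside other nodes.
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);

for (Shape shape : (Iterable<Shape>) shapes)
{
    // Shapes without OLE data are ordinary drawings or pictures.
    if (shape.getOleFormat() != null)
    {
        // Linked objects keep their data outside this file, so skip them.
        if (!shape.getOleFormat().isLink())
        {
            // "Word.Document.12" is the ProgId of an embedded Word document.
            if (shape.getOleFormat().getProgId().equals("Word.Document.12"))
            {
                // Copy the embedded object's bytes into memory.
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                shape.getOleFormat().save(baos);

                // Load those bytes as a separate document.
                InputStream inputStream = new ByteArrayInputStream(baos.toByteArray());
                Document newDoc = new Document(inputStream);

                // Append it to the end of the parent document.
                doc.appendDocument(newDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
                break; // stop after the first embedded document
            }
        }
    }
}

doc.save(getMyDir() + "16.2.0.doc");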

Hope this helps.

Best regards,

I need to get all the files embedded in the document.
With the code above I only get the first one, even when the parent document contains more than one.

Here is a sample that contains multiple Word files inside the parent document.

Best Regards!

Hi Anas,

Thanks for your inquiry. You just need to remove the break; statement inside the if (shape.getOleFormat().getProgId().equals("Word.Document.12")) block, so that the loop continues to the remaining shapes instead of stopping after the first match. Hope this helps.

Best regards,

Hello,
I hope all is well.
I have a couple of questions. What is the meaning of “Word.Document.12”?
Also, I want to append all Office documents, such as Excel, PowerPoint and others. Is there a standard way to read those Office files, I mean files other than Word documents?

Best Regards!

Hi Anas,

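Thanks for your inquiry. “Word.Document.12” is a programmatic identifier (ProgId): a string that names the application and object type of the embedded OLE object. It is returned by the ProgId property of the shape's OleFormat. Secondly, I think you can convert Excel and PowerPoint documents to PDF using the Aspose.Cells and Aspose.Slides APIs.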
https://products.aspose.com/total/

Once all Office files are converted to individual PDF files, you can use Aspose.Pdf to concatenate them into one big PDF.
https://docs.aspose.com/pdf/java/concatenate-pdf-documents/

You can also convert the final PDF to Word format using Aspose.Pdf for Java. Hope this helps.
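
To make the ProgId answer above more concrete: ProgIds conventionally follow the pattern Program.Component.Version, so “Word.Document.12” reads as program Word, component Document, version 12 (12 being the internal version number of Office 2007, the release that introduced .docx). The sketch below only splits such a string into its parts. The class name ProgIdParts and its fields are hypothetical helpers written for this illustration and are not part of any Aspose API.

```java
public class ProgIdParts {
    final String program;    // e.g. "Word"
    final String component;  // e.g. "Document"
    final String version;    // e.g. "12"; empty for a version-independent ProgId

    private ProgIdParts(String program, String component, String version) {
        this.program = program;
        this.component = component;
        this.version = version;
    }

    /** Splits "Program.Component[.Version]" on dots and rejects any other shape. */
    static ProgIdParts parse(String progId) {
        String[] parts = progId.split("\\.", -1);
        if (parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException("Unexpected ProgId: " + progId);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty segment in ProgId: " + progId);
            }
        }
        String version = parts.length == 3 ? parts[2] : "";
        return new ProgIdParts(parts[0], parts[1], version);
    }

    public static void main(String[] args) {
        ProgIdParts p = parse("Word.Document.12");
        // Prints: Word / Document / 12
        System.out.println(p.program + " / " + p.component + " / " + p.version);
    }
}
```

Not every ProgId found in a real document is guaranteed to follow this three-part convention, so treat a parse failure as "unknown object type" rather than as an error in the document.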

Best regards,

Okay, sounds good.

But is there any way to determine whether a file attached inside a Word document is Excel, PowerPoint, Visio, etc.?
I mean, is there a way to recognize them, the way a Word file is recognized by “Word.Document.12”?

Best Regards!

Hi Anas,

Thanks for your inquiry. You can use the following code to print the ProgId of each embedded object:

// Collect all shapes, including those nested inside other nodes.
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);

for (Shape shape : (Iterable<Shape>) shapes)
{
    // Only shapes that hold an OLE object have a ProgId.
    if (shape.getOleFormat() != null)
    {
        System.out.println(shape.getOleFormat().getProgId());
    }
}
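
Once the ProgId strings are printed, one possible way to route each embedded object to the right converter is to match on the leading program name. The following is a minimal sketch using plain Java only. The class ProgIdClassifier, the method classify, and the prefix table are assumptions made for this illustration; the prefixes are based on commonly seen values such as “Excel.Sheet.12” and “PowerPoint.Show.12”, so please verify them against the ProgIds printed from your own documents.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class ProgIdClassifier {
    // Assumed prefix table, not an official list: extend it after checking
    // the ProgIds that actually occur in your documents.
    private static final Map<String, String> FAMILY_BY_PREFIX = new LinkedHashMap<>();
    static {
        FAMILY_BY_PREFIX.put("word.", "Word");
        FAMILY_BY_PREFIX.put("excel.", "Excel");
        FAMILY_BY_PREFIX.put("powerpoint.", "PowerPoint");
        FAMILY_BY_PREFIX.put("visio.", "Visio");
    }

    /** Returns the application family for a ProgId, or "Unknown" if no prefix matches. */
    static String classify(String progId) {
        if (progId == null) {
            return "Unknown";
        }
        // Compare case-insensitively against the lower-case prefixes.
        String normalized = progId.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : FAMILY_BY_PREFIX.entrySet()) {
            if (normalized.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "Unknown";
    }

    public static void main(String[] args) {
        // Sample values for illustration only.
        String[] samples = {"Word.Document.12", "Excel.Sheet.12", "PowerPoint.Show.12", "Package"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + classify(sample));
        }
    }
}
```

In the loop above, the result of classify(shape.getOleFormat().getProgId()) could then decide which product handles the extracted stream, with "Unknown" objects skipped or logged.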

Best regards,