Issue with Doc conversion to PDF with Embedded Document

Hello

I am trying to convert a DOC file to PDF.
The DOC file I am testing has embedded attachments/objects such as images, an Excel file, a JAR file, and possibly other file types.
The converted PDF contains only images of those file icons; the attachments themselves are missing from the generated PDF.
How can I get those files and embed them in the PDF?

Thanks
Rahul

I also tried the insertOleObject method of DocumentBuilder, but it only embeds the file when the document is saved in Word format. As soon as the document is saved as PDF, only the icon of the embedded content appears in the PDF, and it is not clickable.

@rahulgupta01,

To ensure a timely and accurate response, please ZIP and attach the following resources here for testing:

  • Your input Word document with embedded objects
  • Aspose.Words generated output document showing the undesired behavior
  • Your expected document showing the correct output. Please create this document using the Microsoft Word application.

As soon as you have these pieces of information ready, we will start investigating your issue and provide you with code to achieve the expected output. Thanks for your cooperation.

Please find attached a sample document with embedded documents (the input Word file, zipped) along with the generated PDF file.

Sample_Pdf.pdf (216.5 KB)
Sample_Test_Doc.zip (1.8 MB)

@rahulgupta01,

Please also ZIP and attach your expected PDF document showing the correct output. You may create this document using Microsoft Word or any other suitable application. We will then examine the structure of your expected document to see how you want the final output to be generated. Thanks for your cooperation.

The output file should have clickable icons or text that open the embedded file.
A sample with a clickable icon for the embedded file is attached.

Sample_Test_Doc.zip (1.8 MB)

@rahulgupta01,

Are you seeing wrong behavior in Aspose.Words generated PDF files? If yes, then the expected final output should be PDF, but you have again attached the Word file. Please ZIP and attach your expected PDF document which shows the desired output.

I am attaching a zipped version of the PDF file. Please refer to the bottom left of page 1 for an example of a clickable icon.

Sample_Doc_Out_Attach.zip (3.3 MB)

@rahulgupta01,

I understand what you are asking about. Please see these documents (Docs.zip (2.2 MB)).

However, when you convert this “Sample_Test_Doc.docx” to PDF using MS Word 2016, it also does not preserve the embedded objects as clickable. Can you achieve the expected result using MS Word? If yes, please provide the steps to do so in MS Word.

I know that MS Word also does not preserve embedded documents when I save the DOC as PDF. However, we have a requirement that embedded documents must also be available in the newly created PDF when we convert our DOC to PDF. So, kindly tell me how we can achieve this. You already have the expected output PDF.

@rahulgupta01,

We have logged this requirement in our issue tracking system. The ID of this issue is WORDSNET-17087. We will further look into the details of this problem and will keep you updated on the status of this issue.

@rahulgupta01,

Regarding WORDSNET-17087, unfortunately the implementation of the fix of this issue has been postponed till a later date (currently no estimates are available). We will inform you via this thread as soon as this issue is resolved or any estimates are available. We apologize for any inconvenience.

As a workaround, it may be possible to embed the attachments into the Aspose.Words generated PDF output using a third-party PDF editing tool (such as Aspose.PDF for Java). You could probably get the attachment data and position from the Aspose.Words DOM and then create a File Attachment Annotation in the PDF output with Aspose.PDF. If you are interested in such a workaround, we could investigate it further and provide a code sample.
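
One step of this workaround can be sketched independently of either library: converting the icon's position in the Word layout into the rectangle for the PDF annotation. The sketch below assumes the icon's bounds are already known in page coordinates, in points, measured from the top-left corner of the page, and that the PDF page uses the default coordinate system with its origin at the bottom-left. The class and method names (`AttachmentRect`, `toPdfRect`) are hypothetical and are not part of Aspose.Words or Aspose.PDF. The library calls that read the bounds and create the annotation are deliberately left out, since they are not confirmed in this thread.

```java
import java.util.Arrays;

/**
 * Hypothetical helper (not part of any Aspose API): maps a box measured
 * from the top-left of a page to a PDF rectangle measured from the
 * bottom-left. All values are in points (1/72 inch).
 */
final class AttachmentRect {
    private AttachmentRect() {}

    /**
     * @param left       distance from the left page edge to the icon
     * @param top        distance from the top page edge to the icon
     * @param width      icon width
     * @param height     icon height
     * @param pageHeight height of the page
     * @return {llx, lly, urx, ury}: lower-left and upper-right corners
     */
    static double[] toPdfRect(double left, double top,
                              double width, double height,
                              double pageHeight) {
        if (width < 0 || height < 0 || pageHeight <= 0) {
            throw new IllegalArgumentException("Invalid dimensions");
        }
        double llx = left;
        double urx = left + width;
        // Flip the vertical axis: the PDF y coordinate grows upwards.
        double ury = pageHeight - top;
        double lly = ury - height;
        return new double[] {llx, lly, urx, ury};
    }

    public static void main(String[] args) {
        // Assumed example: a 48x48 pt icon, 72 pt from the left and
        // 100 pt from the top of a US Letter page (792 pt high).
        double[] rect = toPdfRect(72, 100, 48, 48, 792);
        System.out.println(Arrays.toString(rect));
        // prints [72.0, 644.0, 120.0, 692.0]
    }
}
```

The resulting rectangle would then be passed to whichever PDF library creates the File Attachment Annotation, together with the attachment bytes extracted from the Word document.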

The issue you reported earlier (filed as WORDSNET-17087) has been fixed in the Aspose.Words for Java 22.11 update, which is also available on Maven.