DOC Produced by Aspose.Pdf Fails to Load Into Aspose.Words


Hello,
I am attempting to convert a PDF document to an MHTML document using Aspose.Pdf and Aspose.Words. Aspose.Pdf opens the PDF document and saves it to a DOC file successfully, but the resulting DOC file fails to load in Aspose.Words.

Code:
MemoryStream docStream = new MemoryStream();

Aspose.Pdf.Document pdfDoc = new Aspose.Pdf.Document(pdfDocPath);
pdfDoc.Save(docStream, Aspose.Pdf.SaveFormat.Doc);

Aspose.Words.Document wordDoc = new Aspose.Words.Document(docStream);
wordDoc.Save(outputPath, Aspose.Words.SaveFormat.Mhtml);

When docStream is passed into Aspose.Words.Document(), it produces a "The document appears to be corrupted and cannot be loaded" error.

I have attached the PDF document I have been working with.

Thanks.

Aspose.Pdf Version: 6.2.0.0
Aspose.Words Version: 10.4.0.0

Hi Jordan,

Thank you for sharing the sample code and template file.

We were able to reproduce the issue you described in our initial testing. It has been registered in our issue tracking system with issue id: PDFNEWNET-30768. You will be notified via this thread of any updates on the reported issue.

Sorry for the inconvenience caused,

I have tried to perform the same operation to get around an issue with your Aspose.PDF component increasing the document size when using the replace operation. My plan was to convert to a Word DOC, perform the replace, and then convert back to PDF for final manipulation.


I don’t have the option to write the file to disk and then reopen.

The question is: why do you close the stream when a save is performed? This is wrong. A component should not modify the state of a stream passed to it; it is up to the caller to manage its resources, not a component that is simply populating them.
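If the stream really is closed after the save, one general way to keep working in memory is to copy the bytes out and wrap them in a new stream. The following is only a minimal sketch under that assumption: `StreamHandoff` and `Reopen` are hypothetical names, not part of any Aspose API, and it uses only `System.IO`. It relies on `MemoryStream.ToArray()`, which is documented to work on a closed stream. It does not address the load failure itself.

```csharp
using System;
using System.IO;

// Hypothetical helper (not an Aspose API): hands the bytes written to a
// MemoryStream over to the next consumer, even if the writer closed it.
public static class StreamHandoff
{
    // Returns a new read-only stream positioned at 0 with the same content.
    public static MemoryStream Reopen(MemoryStream written)
    {
        if (written == null) throw new ArgumentNullException(nameof(written));

        // ToArray() copies the written bytes and still works after Close().
        byte[] bytes = written.ToArray();

        // A stream created from a byte array starts at position 0.
        return new MemoryStream(bytes, writable: false);
    }
}
```

The stream returned by `StreamHandoff.Reopen(ms)` could then be passed to the next component in place of `ms`. The cost is one extra copy of the document in memory.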


Previously I had assumed that you were doing something with the memory stream. I just tried a couple of other things, and it seems that any Word document I write with Aspose.PDF --> Word Doc cannot be opened in Aspose.Words, yet any doc I create with MS Word can…

pdfDocument.Save("d:\\pdf\\wordtest.doc", SaveFormat.Doc);
wordDocument = new Aspose.Words.Document("d:\\pdf\\wordtest.doc");

This fails with the above-mentioned error.

var ms = new MemoryStream();
pdfDocument.Save(ms, SaveFormat.Doc);
wordDocument = new Aspose.Words.Document(ms);

This fails with the above-mentioned error.

All documents created outside Aspose.PDF can be opened by Aspose.Words.

This indicates an issue with the Aspose.PDF save to word doc file.

I was trying to do this as a workaround for another issue (https://forum.aspose.com/t/104549#331708) with the PDF component, so I may need to start reviewing other document manipulation components, as I simply don’t have a viable solution at the moment using yours.

Hi Jordan,

Thanks for your patience.

Our development team is working hard to get this issue fixed, but I am afraid it is not yet completely resolved. I have asked the team to share an ETA for its resolution. Please be patient and allow us a little more time. We apologize for the inconvenience.

Hello, could we please get an update as to the progress on this issue?

Thanks.

Hi Jordan,

Thanks for your patience.

We have investigated this problem further and observed that the DOC file generated with Aspose.Pdf for .NET can be viewed in MS Word without any problem, whereas the problem occurs when its contents are read using Aspose.Words for .NET. I think the Aspose.Words team also needs to look into this matter. For that reason, I am moving this thread to the respective forum, where my colleagues responsible for that product will be in a position to share their thoughts. Thanks for your cooperation and understanding in this regard.

Hi Jordan,

Thank you for reporting this problem to us. I am a representative of the Aspose.Words team. I managed to reproduce the problem on my side and logged the issue in our defect database. We will let you know once it is resolved.

Best regards,

Hello,
Could I please get an update on the progress of this issue?

Thanks,
Jordan

Hi Jordan,


Thanks for your inquiry.

Good news: the issue has been resolved in the current code base, and the fix will be included in the next release of Aspose.Words. We will inform you as soon as it is released.

Thanks,

The issues you have found earlier (filed as WORDSNET-5326) have been fixed in this .NET update and in this Java update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.

Hi Jordan,


Thanks for your patience.

We have further investigated the issue PDFNEWNET-30763, and I am unable to notice any problem when using Aspose.Pdf for .NET 6.7.0 and Aspose.Words for .NET 11.1.0. The MS Word document generated with Aspose.Pdf for .NET opens properly with Aspose.Words for .NET. Please try the latest release versions, and if you still face any issue or have any further query, please feel free to contact us.

The issues you have found earlier (filed as ) have been fixed in this update. This message was posted using BugNotificationTool from Downloads module by MuzammilKhan