Re: PDF to RTF converter

I’ve spent a day trying to figure this out (PDF to RTF) without success.

I am using Aspose.Words 11.8.0.0 and Aspose.Pdf 7.7.0.0

Here is one of the code snippets.

//load the PDF
Aspose.Pdf.Document pdfDoc = new Aspose.Pdf.Document(path);

MemoryStream docStream = new MemoryStream();
Aspose.Pdf.DocSaveOptions saveOptions = new Aspose.Pdf.DocSaveOptions();
saveOptions.Mode = Aspose.Pdf.DocSaveOptions.RecognitionMode.Flow;

pdfDoc.Save(docStream,saveOptions);

//now load into a DOC for conversion to RTF
Aspose.Words.Document doc = new Aspose.Words.Document(docStream);

MemoryStream rtfStream = new MemoryStream();
doc.Save(rtfStream, Aspose.Words.SaveFormat.Rtf);

I currently get the following exception:

Aspose.Words.FileCorruptedException: The document appears to be corrupted and cannot be loaded. ---> System.ObjectDisposedException: Cannot access a closed Stream.
 at System.IO.__Error.StreamIsClosed()
 at System.IO.MemoryStream.get_Position()
 at Aspose.Words.Document.x5d4db34d48fb3129(Stream xcf18e5243f8d5fd3, LoadOptions x27aceb70372bde46)
 --- End of inner exception stack trace ---
 at Aspose.Words.Document.x5d4db34d48fb3129(Stream xcf18e5243f8d5fd3, LoadOptions x27aceb70372bde46)
 at Aspose.Words.Document.x5d95f5f98c940295(Stream xcf18e5243f8d5fd3, LoadOptions x27aceb70372bde46)
 at Aspose.Words.Document..ctor(Stream stream, LoadOptions loadOptions)
 at Aspose.Words.Document..ctor(Stream stream)

I’ve tried several different ways of doing this, including setting the stream position to 0 (which gives me a closed stream exception).
It always throws an exception when creating the doc from the stream.
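
The inner exception in the trace is what a plain MemoryStream raises once it has been closed, so the symptom can be reproduced without either Aspose library. Below is a minimal sketch using only System.IO. The explicit Close() call is a stand-in for whatever closes the stream during the save (an assumption, not confirmed behavior), and ClosedStreamDemo and PositionThrows are made-up names for illustration.

```csharp
using System;
using System.IO;

static class ClosedStreamDemo
{
    // Returns true if setting Position on the stream throws ObjectDisposedException.
    public static bool PositionThrows(Stream stream)
    {
        try
        {
            stream.Position = 0;
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }

    static void Main()
    {
        MemoryStream docStream = new MemoryStream();
        docStream.WriteByte(0x25);

        Console.WriteLine(docStream.CanRead);          // True: the stream is still open
        Console.WriteLine(PositionThrows(docStream));  // False

        docStream.Close(); // stand-in for the library closing the stream (assumption)

        Console.WriteLine(docStream.CanRead);          // False: the stream is closed
        Console.WriteLine(PositionThrows(docStream));  // True
    }
}
```

Checking CanRead right after the save is a cheap way to tell whether the stream was closed before it is handed to the next library.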

Suggestions?
Scott

Hi Scott,

Thanks for using our products.

I have tested the scenario using the code snippet you shared, and I can reproduce the problem: the Aspose.Words.Document object throws an exception when loading the output generated with Aspose.Pdf for .NET. However, when I save the output generated by the Aspose.Pdf.Document object to a file on my system, the resultant file is generated properly. So the problem seems to be related to Aspose.Words for .NET. I am moving this thread to the respective forum, where my colleagues who look after that product will be in a better position to answer this query. You will be updated soon with the status of the correction.

The file generated with following code snippet is properly opening in MS Word 2010.

[C#]

// load the PDF
Aspose.Pdf.Document pdfDoc = new Aspose.Pdf.Document("c:/pdftest/ZoomValue.pdf");
// MemoryStream docStream = new MemoryStream();
FileStream fs = new FileStream("c:/pdftest/New_ConvertedFile.doc", FileMode.Create);
Aspose.Pdf.DocSaveOptions saveOptions = new Aspose.Pdf.DocSaveOptions();
saveOptions.Mode = Aspose.Pdf.DocSaveOptions.RecognitionMode.Flow;
// pdfDoc.Save(docStream, Aspose.Pdf.SaveFormat.Doc);
pdfDoc.Save(fs, saveOptions);
// pdfDoc.Save("c:/pdftest/ZoomValue.doc", Aspose.Pdf.SaveFormat.Doc);
fs.Close();

Thank you for moving this into the proper thread (Aspose.Words.Doc).

You are correct in saying that Words is where I am having the issue.
One thing to note: in your example, you are saving things to the hard drive.
Unfortunately, because of the nature of my business, saving to the hard drive is not an option (HIPAA compliance for medical documentation). So everything must stay in memory as a stream.

One of the options I tried was to actually save the PDF->DOC conversion to a file.
I could open that document in MS Word, but not with Aspose.Words. So there is still a possibility that Aspose.Pdf is at fault by not creating a well-formed document.

I’m hoping the Aspose.Words team will try this and see if it is their issue.

Thanks
Scott

Hi Scott,

Thanks for your inquiry. I am a representative of the Aspose.Words team.

Could you please save your DOC file to disk using Aspose.Pdf and attach the problematic Word document here for testing? I will investigate the issue on my side and provide you more information.

Best regards,

Well, this is going to get things bounced around between the Aspose.Words and Aspose.Pdf teams.

I’ve found the issue and have managed to code around the problem.

Aspose.Pdf.Document pdfDoc = new Aspose.Pdf.Document(stream);

Aspose.Pdf.DocSaveOptions saveOptions = new Aspose.Pdf.DocSaveOptions();
saveOptions.Mode = Aspose.Pdf.DocSaveOptions.RecognitionMode.Flow;

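// holds the converted DOC bytes (despite the name)
byte[] pdfBytes;
using(MemoryStream docStream = new MemoryStream())
{
    pdfDoc.Save(docStream, saveOptions);
    // ToArray() copies only the bytes written; GetBuffer() can include unused trailing capacity
    pdfBytes = docStream.ToArray();
}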

Aspose.Words.Document doc;
using(MemoryStream docStream2 = new MemoryStream(pdfBytes))
{
    doc = new Aspose.Words.Document(docStream2);

}

using(MemoryStream rtfStream = new MemoryStream())
{
    doc.Save(rtfStream, Aspose.Words.SaveFormat.Rtf);

    rtfStream.Position = 0;
    rtfView.LoadDocument(rtfStream, DevExpress.XtraRichEdit.DocumentFormat.Rtf);
}

This will give me a stream that is an RTF.
The problem isn't with Aspose.Words; it's with Aspose.Pdf, who bounced this thread over to Words. If you look at the code, the byte[] buffer is the workaround. Aspose.Pdf will not let you reuse the stream it saved to when creating a new document: it closes the stream, and the stream is then unusable, even for reading.
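
To illustrate why the byte-array detour works: a closed MemoryStream still hands out its contents, so the bytes can be wrapped in a fresh, readable stream. The sketch below uses only System.IO, with Close() standing in for the library closing the stream (an assumption). StreamReopenDemo and Reopen are hypothetical names. It also shows that GetBuffer() returns the whole internal buffer, which can be longer than the data written, while ToArray() returns exactly the written bytes.

```csharp
using System;
using System.IO;

static class StreamReopenDemo
{
    // Copies the contents of a (possibly closed) MemoryStream into a new, readable stream.
    public static MemoryStream Reopen(MemoryStream source)
    {
        // ToArray() works even after the stream has been closed.
        byte[] bytes = source.ToArray();
        return new MemoryStream(bytes);
    }

    static void Main()
    {
        MemoryStream docStream = new MemoryStream();
        docStream.Write(new byte[] { 1, 2, 3 }, 0, 3);
        docStream.Close(); // stand-in for the library closing the stream (assumption)

        // The internal buffer is at least as long as the data, and often longer.
        Console.WriteLine(docStream.GetBuffer().Length >= docStream.ToArray().Length); // True

        using (MemoryStream fresh = Reopen(docStream))
        {
            Console.WriteLine(fresh.Length);     // 3
            Console.WriteLine(fresh.ReadByte()); // 1
        }
    }
}
```

The copy doubles the memory used for each document while both arrays are alive, which is the cost of this approach.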

Hope this helps. BTW, this code will cause memory issues if you use it a lot.
One suggestion I have for Aspose is that they implement IDisposable on their objects, so that they can be wrapped in a using statement and disposed of.

Hi Scott,

Thanks for sharing the details.

In my earlier attempt, I saved the output DOC file generated with Aspose.Pdf for .NET into a FileStream object, which created the file on disk. In your first post, however, you were trying to save the resultant DOC file into a Stream object (MemoryStream), and I am afraid Aspose.Pdf for .NET does not support saving the resultant DOC file into a Stream object. Nevertheless, the approach of saving the DOC file contents into a byte array seems to work, and as long as it serves your purpose, you may continue using it.

We are sorry for this inconvenience.

The issues you have found earlier (filed as PDFNEWNET-33486) have been fixed in Aspose.Pdf for .NET 8.3.0.
