Free Support Forum - aspose.com

FileCorruptedException when creating Document object from ByteArray

Hi,

I'm running into an issue when I try to create a Document object from a byte stream. Is there a particular encoding (such as UTF-8 or UTF-16) that Aspose.Words requires the input stream to use when creating the Document object?

Can you please help me out here? Unfortunately, I cannot attach the byte stream here …

The error I'm getting is as follows:

The document appears to be corrupted and cannot be loaded.
com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.
at com.aspose.words.FileFormatUtil.a(FileFormatUtil.java:98)
at com.aspose.words.Document.b(Document.java:1215)
at com.aspose.words.Document.a(Document.java:1084)
at com.aspose.words.Document.<init>(Document.java:194)
at com.aspose.words.Document.<init>(Document.java:165)
at com.aspose.words.Document.<init>(Document.java:160)
at aspose.words.DocumentUtils.createDocument(DocumentUtils.java:301)
at …
Caused by: java.lang.IllegalStateException: Invalid hex string.
at asposewobfuscated.ke.ae(PalFormatter.java:368)
at com.aspose.words.abz.hn(NrxXmlUtil.java:203)
at com.aspose.words.avz.a(WmlStylesReader.java:218)
at com.aspose.words.avz.a(WmlStylesReader.java:45)
at com.aspose.words.avt.read(WmlReader.java:64)
at com.aspose.words.Document.b(Document.java:1151)

Hi

Thanks for your request. Unfortunately, it is difficult to say what the problem is without the document. I need this document to reproduce the problem on my side.

It is safe to attach files in the forum. If you attach your document here, only you and Aspose staff members can download it.

Best regards,

OK, I managed to save the byte stream to a .doc file. I am attaching it here.

Hi

Thank you for reporting this problem to us. I managed to reproduce the problem on my side. Your request has been linked to the appropriate issue. You will be notified as soon as it is resolved.

As a temporary workaround, you can open/save the file using MS Word. Then Aspose.Words is able to process the file.

Best regards,

Opening and saving the file in MS Word is not an option, since this is a byte stream that is delivered to a Linux server for processing, streamed directly from a web service.


Can you please let me know if there is an estimated time of resolution for this issue? We do have licenses for Aspose.Total. Since this is somewhat time critical, can you please let us know how to raise the priority of this issue or escalate it?

Thank you!

Hi

Thanks for your request. Unfortunately, I cannot give you an estimate at the moment. Our developers will analyze the problem, and then I will be able to give you more information. We will keep you informed about the status of the issue and let you know once it is resolved.

Best regards,

Hi Alexey,

Are there any updates or a resolution for this issue yet?

Hi

Thanks for your request. Unfortunately, the problem is not yet resolved. It occurs because the rsid of one of the styles in your WML document is not a valid hex value:

<w:style w:styleId="cvUMParent" w:type="paragraph">
  <w:name w:val="cvUMParent" />
  <w:rsid w:val="00F73CVUM" />
  <w:rPr>
    <w:rFonts w:ascii="Arial" w:h-ansi="Arial" />
    <wx:font wx:val="Arial" />
    <w:lang w:bidi="AR-SA" w:fareast="EN-US" w:val="EN-US" />
  </w:rPr>
</w:style>

This violates the Microsoft Office 2003 XML specification, which is why Aspose.Words throws an exception.
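
For anyone who wants to check a WordprocessingML stream for this condition before loading it, here is a minimal sketch in plain Java. It is not part of Aspose.Words: the class name `RsidCheck` and the method `findInvalidRsids` are hypothetical, and the rule that a valid rsid is exactly 8 hexadecimal digits (4 bytes) is an assumption based on this thread, not a quotation from the specification. It also uses a simple regular expression rather than a real XML parser, so it only recognizes the `<w:rsid w:val="..." />` form shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an Aspose.Words API.
class RsidCheck {
    // Matches <w:rsid w:val="..."/> and captures the attribute value.
    private static final Pattern RSID =
            Pattern.compile("<w:rsid\\s+w:val=\"([^\"]*)\"\\s*/>");

    // Assumption: a valid rsid is exactly 8 hex digits (4 bytes).
    private static final Pattern HEX8 = Pattern.compile("[0-9A-Fa-f]{8}");

    // Returns every rsid value in the XML text that is not 8 hex digits.
    static List<String> findInvalidRsids(String wordMl) {
        List<String> invalid = new ArrayList<>();
        Matcher m = RSID.matcher(wordMl);
        while (m.find()) {
            String value = m.group(1);
            if (!HEX8.matcher(value).matches()) {
                invalid.add(value);
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        String xml = "<w:style w:styleId=\"cvUMParent\" w:type=\"paragraph\">"
                + "<w:rsid w:val=\"00F73CVUM\" /></w:style>";
        System.out.println(findInvalidRsids(xml)); // prints [00F73CVUM]
    }
}
```

The value from the style above, 00F73CVUM, is reported because it has nine characters and contains V, U and M, none of which are hex digits.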

Best regards,

Hi Alexey,

I managed to change the <w:rsid w:val="00F73CVUM" /> value at the source itself. The document bytes are now generated as before, except that the snippet above now reads <w:rsid w:val="00F73CAB" />. I believe this is a proper 4-byte rsid value in hex format.

However, when I try to create the Aspose Document object, I still run into errors, this time with UTF charsets. I am attaching the error stack trace below.

Not sure why we are getting this UTF-7 charset error; the document bytes are supposed to be a UTF-8 byte stream.

com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.
at com.aspose.words.FileFormatUtil.a(FileFormatUtil.java:98)
at com.aspose.words.Document.b(Document.java:1215)
at com.aspose.words.Document.a(Document.java:1084)
at com.aspose.words.Document.<init>(Document.java:194)
at com.aspose.words.Document.<init>(Document.java:165)
at com.aspose.words.Document.<init>(Document.java:160)
at com.wellsfargo.ws.creditview.services.aspose.words.DocumentUtils.createDocument(DocumentUtils.java:111)
at com.wellsfargo.ws.creditview.services.aspose.words.DocumentUtils.convertToPDF(DocumentUtils.java:82)
at com.wellsfargo.ws.creditview.services.ecf.handler.ECFCompileHandler.compileECFReport(ECFCompileHandler.java:229)
at com.wellsfargo.ws.creditview.services.ecf.mdb.ECFCompileMDB.onMessage(ECFCompileMDB.java:116)
at weblogic.ejb.container.internal.MDListener.execute(MDListener.java:466)
at weblogic.ejb.container.internal.MDListener.transactionalOnMessage(MDListener.java:371)
at weblogic.ejb.container.internal.MDListener.onMessage(MDListener.java:327)
at weblogic.jms.client.JMSSession.onMessage(JMSSession.java:4123)
at weblogic.jms.client.JMSSession.execute(JMSSession.java:4013)
at weblogic.jms.client.JMSSession$UseForRunnable.run(JMSSession.java:4541)
at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:464)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:200)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:172)
Caused by: java.lang.IllegalStateException: java.nio.charset.UnsupportedCharsetException: UTF-7
at asposewobfuscated.rv.pl(Encoding.java:506)
at asposewobfuscated.rv.pi(Encoding.java:468)
at asposewobfuscated.rv.a(Encoding.java:452)
at asposewobfuscated.rv.m(Encoding.java:392)
at com.aspose.words.pa.a(FileFormatDetector.java:318)
at com.aspose.words.pa.aN(FileFormatDetector.java:304)
at com.aspose.words.pa.A(FileFormatDetector.java:44)
at com.aspose.words.Document.b(Document.java:1108)
… 17 more
Caused by: java.nio.charset.UnsupportedCharsetException: UTF-7
at java.nio.charset.Charset.forName(Charset.java:499)
at asposewobfuscated.rv.pl(Encoding.java:502)
… 24 more
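
The trace shows `Charset.forName("UTF-7")` failing inside the format detector. The thread does not say why the detector asks for UTF-7. As a diagnostic sketch only, the plain-Java snippet below checks two things that may help narrow it down: whether the running JVM has a UTF-7 charset at all, and whether the byte stream starts with a recognizable byte order mark. The class `EncodingProbe` and its method `detectBom` are hypothetical names, and the link between a byte order mark and this exception is an assumption, not something confirmed here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; not an Aspose.Words API.
class EncodingProbe {
    // Returns a label for a recognized byte order mark, or "none".
    // UTF-32 marks are not distinguished from UTF-16 in this sketch.
    static String detectBom(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        // UTF-7 signatures begin with the ASCII characters "+/v".
        if (startsWith(data, 0x2B, 0x2F, 0x76)) return "UTF-7";
        return "none";
    }

    // Compares the leading bytes of data with the given unsigned values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Reports whether this JVM can resolve the charset named in the trace.
        System.out.println("UTF-7 supported: " + Charset.isSupported("UTF-7"));

        // Placeholder input; in practice, pass the document bytes here.
        byte[] sample = "<?xml version=\"1.0\"?>".getBytes(StandardCharsets.UTF_8);
        System.out.println("BOM: " + detectBom(sample));
    }
}
```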


Hi

Thanks for your request. Could you please attach your changed document here for testing?

Also, the easiest way to work around the problem is to open and save your document in MS Word. MS Word repairs the incorrect ids in the document, and it then works fine.

Best regards,

Hi,

I appreciate your quick response. I have attached the document here.
From what I have read about RSID values, they are nothing more than unique time-based identifiers for some sort of internal style-sheet versioning specific to MS Word, and they get rewritten when styles change. Even if they are omitted, it really doesn't matter.
However, I do not believe the document charset issue is related to this.
Besides, opening and saving the document in MS Word is not an option, since the document conversions are done on a Linux app server with the data piped directly from the DB.
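
Since the conversion runs server-side, one possible pre-processing step is to drop malformed rsid elements from the bytes before they reach the Document constructor. The sketch below is plain Java and rests on several assumptions: that the stream is UTF-8 WordprocessingML, that a valid rsid is exactly 8 hex digits, and that omitting the element is harmless as suggested above. None of this has been verified against Aspose.Words. `RsidSanitizer` and `dropInvalidRsids` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing helper; not an Aspose.Words API.
class RsidSanitizer {
    // Matches <w:rsid w:val="..."/> and captures the attribute value.
    private static final Pattern RSID =
            Pattern.compile("<w:rsid\\s+w:val=\"([^\"]*)\"\\s*/>");

    // Removes rsid elements whose value is not 8 hex digits (assumed rule).
    // Valid rsid elements and all other content are left unchanged.
    static byte[] dropInvalidRsids(byte[] utf8WordMl) {
        String xml = new String(utf8WordMl, StandardCharsets.UTF_8);
        Matcher m = RSID.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            boolean valid = m.group(1).matches("[0-9A-Fa-f]{8}");
            String replacement = valid ? m.group() : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = "<w:style><w:name w:val=\"cvUMParent\" />"
                + "<w:rsid w:val=\"00F73CVUM\" /></w:style>";
        byte[] cleaned = dropInvalidRsids(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(cleaned, StandardCharsets.UTF_8));
    }
}
```

The cleaned bytes could then be wrapped in a `java.io.ByteArrayInputStream` and passed on as before. A regular expression keeps the sketch short; a streaming XML parser would be more robust against variations in attribute order or quoting.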

Hi

Thank you for the additional information. I can successfully open and save the document you attached. I used the latest version of Aspose.Words for testing. You can download it from here:

http://www.aspose.com/community/files/72/java-components/aspose.words-for-java/category1378.aspx

Best regards,

The issues you have found earlier (filed as WORDSNET-5028) have been fixed in this .NET update and in this Java update.

