Free Support Forum - aspose.com

UnsupportedCharsetException: UTF-7

Hi,

We are facing an issue when creating a document from HTML content in some cases. We get the following error message during HTML to DOC conversion:

Caused by: java.lang.IllegalStateException: java.nio.charset.UnsupportedCharsetException: UTF-7
at asposewobfuscated.dj.fB(Encoding.java:482)
at asposewobfuscated.dj.fx(Encoding.java:241)
at asposewobfuscated.dj.k(Encoding.java:460)
at com.aspose.words.ed.fn(FileFormatDetector.java:246)
at com.aspose.words.ed.d(FileFormatDetector.java:38)
at com.aspose.words.Document.b(Document.java:1245)
… 53 more
Caused by: java.nio.charset.UnsupportedCharsetException: UTF-7
at java.nio.charset.Charset.forName(Charset.java:486)
at asposewobfuscated.dj.fB(Encoding.java:478)

We are using the following method to create the document from an HTML input stream, where the input stream is created from String content.
String strRTEContent = "";
byteInStream = new ByteArrayInputStream(strRTEContent.getBytes());
bufInStream = new BufferedInputStream(byteInStream);
doc = new Document(bufInStream, null, LoadFormat.HTML, "");

We set the base URI to null initially, and later we construct the HTML images and embed them in the document. This works fine in 99% of cases, but we have found the above error in a few odd cases.
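
Since only a few inputs fail, it may help to log something about the HTML that triggers the error. The sketch below is based purely on an assumption that the failing content might declare a charset the JVM cannot resolve; nothing in this thread confirms that as the root cause. DeclaredCharset, find and declaresUnsupported are hypothetical names, and the regular expression is a crude scan rather than a real HTML parser:

```java
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical diagnostic helper: scans HTML text for a charset declaration. */
final class DeclaredCharset {

    // Crude match for charset=NAME, with or without quotes around NAME.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([A-Za-z0-9._:+-]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the first declared charset name, or null if none is found. */
    static String find(String html) {
        Matcher m = CHARSET.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    /** True when a charset is declared but this JVM cannot resolve it. */
    static boolean declaresUnsupported(String html) {
        String name = find(html);
        if (name == null) {
            return false; // nothing declared, nothing to flag
        }
        try {
            return !Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            return true; // syntactically illegal charset name
        }
    }
}
```

Calling DeclaredCharset.declaresUnsupported(strRTEContent) before constructing the Document, and logging the content when it returns true, would at least narrow down which inputs are affected.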

Please help us on above issue as we are not able to figure out the root cause of this issue.

Thanks & Best Regards,
Sanjay Singh

Hi

Thanks for your request. Could you please provide the HTML string that causes the problem? I will check the issue on my side and provide you with more information.

Best regards.

Hi Alex,



Please find attached the HTML string that is causing the problem. I have printed the String to the console and saved the content in a .txt file. Please let me know if you need any further details.

Thanks & Regards,
Sanjay Singh

Hi

Thank you for the additional information. I cannot reproduce the problem you reported, but I did reproduce another issue. Here is my test code:

// Create HTML string from file.
File file = new File("C:\\Temp\\SampleHTMLText.htm");
StringBuffer contents = new StringBuffer();
BufferedReader reader = new BufferedReader(new FileReader(file));
String text = null;
while ((text = reader.readLine()) != null)
    contents.append(text).append(System.getProperty("line.separator"));
reader.close();
String strRTEContent = contents.toString();

// Encode the string as UTF-8. getBytes returns exactly the encoded bytes,
// whereas ByteBuffer.array() may include unused trailing capacity.
ByteArrayInputStream byteInStream = new ByteArrayInputStream(strRTEContent.getBytes("UTF-8"));

// Load the HTML and save it as DOC.
Document doc = new Document(byteInStream, null, LoadFormat.HTML, "");
doc.save("C:\\Temp\\out.doc");
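
When turning the HTML string into bytes for the loader, it is worth making the encoding explicit and taking only the bytes that were actually written. ByteBuffer.array() returns the whole backing array, which can be longer than the encoded content, so copying the remaining bytes (or using String.getBytes with a charset) is safer. A minimal sketch, with Utf8Bytes as a hypothetical helper name:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: two equivalent ways to get exact UTF-8 bytes from a String. */
final class Utf8Bytes {

    /** Encodes with String.getBytes and an explicit charset. */
    static byte[] viaGetBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Encodes with Charset.encode, copying only the bytes between position and limit. */
    static byte[] viaEncode(String s) {
        ByteBuffer buf = StandardCharsets.UTF_8.encode(s);
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }
}
```

Either result can be wrapped in a ByteArrayInputStream and passed to the Document constructor shown in this thread.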

Your HTML contains a page break inside a table, which is invalid. Aspose.Words throws an exception when loading such HTML. Here is a simplified version of the HTML:

<html>
<body>
<table>
<tbody>
<tr>
<td>
<span>
<br style="page-break-before: always;" />
</span>
</td>
</tr>
</tbody>
</table>
</body>
</html>

Your request has been linked to the appropriate issue. You will be notified as soon as it is resolved.

Best regards.

The issues you have found earlier (filed as 15927) have been fixed in this update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.

Hi Alexey,


I’m having the same problem as the other users in this thread. I’ve attached a zip file containing the file “input.html”, which in turn contains the String I’m using to create a Document object (using a ByteArrayInputStream, just like the other examples). I get the same exception message: java.nio.charset.UnsupportedCharsetException: UTF-7

Any suggestions? My client has premium support and is tracking down the info for it, but I thought I’d post here in the meantime.

Thanks,
Matt

Hi Matthew,

Thanks for your inquiry.

I was unable to reproduce the issue using the default constructor and the code Alexey used above. Could you please make sure you are calling the constructor like this: Document doc = new Document(byteInStream, null, LoadFormat.HTML, ""); and check whether the issue still occurs?

Could you please provide some further details about your setup? Which versions of Aspose.Words and Java are you using? Which OS do you use? Is your application deployed as a web service?

Thanks,