HTML containing Hebrew to PDF

Hi,

While I am able to convert other HTML files that contain Hebrew text to PDF correctly (the output is not gibberish), the PDF produced from the attached HTML file contains gibberish text.

Can you please look into this document and advise?

Hi Ajesh,

Thanks for your inquiry.

Using the latest version of Aspose.Words (14.6.0), I was able to reproduce this issue on my side. I have logged it in our bug tracking system under the ID WORDSNET-10460. Your thread has been linked to this issue, and you will be notified as soon as it is resolved. Sorry for the inconvenience.

Best regards,

Hi,
Can you please move this issue to Enterprise Support?

Thanks,
Ajesh

Hi Ajesh,

Regarding WORDSNET-10460, our development team has completed its work on your issue and concluded that the behaviour you are observing is not actually a bug in Aspose.Words. We will therefore close this issue as ‘Not a Bug’.

The HTML document is in the ‘Windows-1255’ encoding, but nothing in the document indicates this and no hint is given to Aspose.Words during import. As a result, Aspose.Words uses a different default encoding when reading the document’s contents.

You can use any of the following workarounds to import the document correctly:

  1. Specify what encoding to use during import.
    Document doc = new Document("html.html", new LoadOptions { Encoding = Encoding.GetEncoding(1255) });
  2. Add a <meta> element with charset information into the document’s <head>.
    <html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=windows-1255">
        <title> מתניה כותרת</title>
    ...
  3. Convert the document from ‘Windows-1255’ to Unicode (for example, to UTF-8 with Byte Order Mark (BOM)) in a text editor without changing the document’s contents and then load it as usual:
    Document doc = new Document("html.html");
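
If a text editor is not convenient, the third workaround can also be scripted. The following is only a minimal sketch using standard .NET classes (not Aspose.Words); the class name, method name and the output file name "html-utf8.html" are hypothetical and chosen for illustration. It assumes the source file really is Windows-1255.

```csharp
using System;
using System.IO;
using System.Text;

static class HebrewHtmlConverter
{
    // Re-encodes a Windows-1255 file as UTF-8 with a BOM.
    // Only the byte encoding changes; the text itself is left as-is.
    public static void ConvertToUtf8WithBom(string sourcePath, string targetPath)
    {
        // Needed on .NET Core / .NET 5+, where code page 1255 is not
        // registered by default. On .NET Framework this line can be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Decode the file explicitly as Windows-1255 (code page 1255).
        Encoding hebrew = Encoding.GetEncoding(1255);
        string text = File.ReadAllText(sourcePath, hebrew);

        // UTF8Encoding(true) makes File.WriteAllText emit the byte order mark.
        File.WriteAllText(targetPath, text, new UTF8Encoding(true));
    }
}
```

After calling HebrewHtmlConverter.ConvertToUtf8WithBom("html.html", "html-utf8.html"), the converted file can be loaded with new Document("html-utf8.html") without any LoadOptions.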

Best regards,

The issues you have found earlier (filed as WORDSNET-10460) have been fixed in this .NET update and this Java update.
