Free Support Forum - aspose.com

Convert Hebrew HTML to PDF

Hi,


I am using Aspose.Words to convert HTML with Hebrew fonts to PDF.
There are 2 issues:
  1. The PDF contains 4 pages instead of 1, and the table formatting does not look right in the PDF
  2. The Hebrew text is not encoded correctly
Please assist

Attached
  1. HTML file
  2. PDF with the Issue
  3. Word template used for conversion

Thx
Yaniv

Hi Yaniv,

Thanks for your inquiry. We tested the scenario using the following code example and reproduced the same issues on our side.

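using System.IO;
using Aspose.Words;

// Load the Word template, then insert the HTML into it as a string.
Document document = new Document(MyDir + "template.docx");
DocumentBuilder builder = new DocumentBuilder(document);

// ReadAllText is called here without an explicit encoding.
builder.InsertHtml(File.ReadAllText(MyDir + "pathology.HTML"));

document.Save(MyDir + "17.3.0.pdf");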


We have logged these problems in our issue tracking system as follows:

WORDSNET-15150 : Hebrew text is not encoded correctly in output Docx/Pdf
WORDSNET-15151 : Table formatting is changed in output Docx/Pdf

You will be notified via this forum thread once these issues are resolved. We apologize for the inconvenience. As a workaround, please use the following code example. Hope this helps.

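using Aspose.Words;

// Load the template and the HTML as two separate documents.
Document document = new Document(MyDir + "template.docx");
Document html = new Document(MyDir + "pathology.HTML");

// Insert the loaded HTML document into the template, keeping the HTML's own formatting.
DocumentBuilder builder = new DocumentBuilder(document);
builder.InsertDocument(html, ImportFormatMode.KeepSourceFormatting);

document.Save(MyDir + "17.3.0.pdf");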

Hi Tahir,


The workaround works well.
However, there are 2 issues with this solution:
  1. The PDF doesn’t include the header and footer that are in template.docx
  2. The last PDF page is always empty
Please assist

Thx
Yaniv

Hi Yaniv,

Thanks for your inquiry. The issues WORDSNET-15150 and WORDSNET-15151 are not bugs. Please read the HTML file with Encoding.GetEncoding(1255), as shown below, to fix the issues.

builder.InsertHtml(File.ReadAllText(MyDir + "pathology.HTML", Encoding.GetEncoding(1255)));
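
To illustrate why the explicit encoding matters, here is a minimal standalone sketch that does not use the attached files. It assumes the HTML was saved in Windows-1255 (code page 1255), which is what the fix above implies. The class name HebrewEncodingDemo and the sample word are made up for illustration. File.ReadAllText without an encoding argument falls back to UTF-8 when it finds no byte order mark, so the sketch compares a UTF-8 decode with a code page 1255 decode of the same bytes. The RegisterProvider call is needed on .NET Core and .NET 5+; on .NET Framework, code page 1255 is normally available without it.

```csharp
using System;
using System.Text;

public static class HebrewEncodingDemo
{
    // Returns the Windows-1255 (Hebrew) encoding.
    // Registering the code pages provider enables legacy code pages
    // on .NET Core / .NET 5+, where they are not available by default.
    public static Encoding GetWindows1255()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1255);
    }

    public static void Main()
    {
        Encoding hebrew = GetWindows1255();

        // Hypothetical sample text: the Hebrew word "shalom", written with escapes
        // so the source file's own encoding does not matter.
        string original = "\u05E9\u05DC\u05D5\u05DD";

        // Simulate a file saved in Windows-1255: one byte per Hebrew letter.
        byte[] fileBytes = hebrew.GetBytes(original);

        // Decoding with the wrong encoding garbles the text.
        string decodedAsUtf8 = Encoding.UTF8.GetString(fileBytes);

        // Decoding with code page 1255 restores it.
        string decodedAs1255 = hebrew.GetString(fileBytes);

        Console.WriteLine("Bytes in file: " + fileBytes.Length);
        Console.WriteLine("UTF-8 decode matches original: " + (decodedAsUtf8 == original));
        Console.WriteLine("1255 decode matches original:  " + (decodedAs1255 == original));
    }
}
```

If the 1255 decode matches and the UTF-8 decode does not, the file is being read with the wrong encoding rather than being corrupted by the conversion.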


Your input HTML document contains styles, and these are reflected in the output document. For example, the paper size of the template Word document differs from that of the HTML. In your case, we suggest using the DocumentBuilder.InsertDocument method to insert the HTML document into the Word document.

Please note that Aspose.Words mimics the behavior of MS Word. The HTML content flows to the next page when it is inserted into a Word document.

yaniv.ayalon:
However, there are 2 issues with this solution:
  1. The PDF doesn't include the header and footer that are in template.docx
  2. The last PDF page is always empty

Please use the latest version of Aspose.Words for .NET, 17.4. If you still face the problem, please attach the following resources here for testing:

  • Please attach the output PDF file that shows the undesired behavior.
  • Please create a standalone console application (source code without compilation errors) that helps us to reproduce your problem on our end and attach it here for testing.

As soon as you have these resources ready, we'll start investigating your issue and provide you with more information. Thanks for your cooperation.

PS: To attach these resources, please zip them and click the 'Reply' button, which will bring you to the reply page. At the bottom of that page, you can add attachments to the post by clicking the 'Add/Update' button.