Convert HTML to PDF using Aspose.PDF for .NET - Issues related to fonts

uaprogrammer · June 26, 2018, 1:22pm

Hi,

During HTML to PDF conversion we've run into some strange Aspose behavior:

If we use this code:
var pdf = new Aspose.Pdf.Document(inputFilePath);
we get this exception:
Aspose.Pdf.InvalidPdfFileFormatException: Startxref not found
If we use this code (with the encoding taken from the file):
var htmlLoadOptions = new Aspose.Pdf.HtmlLoadOptions { InputEncoding = "utf-8" };
var pdf = new Aspose.Pdf.Document(inputFilePath, htmlLoadOptions);
we get this exception:
Aspose.Pdf.FontNotFoundException: Font Mangal was not found
If we use this code:
var htmlLoadOptions = new Aspose.Pdf.HtmlLoadOptions { InputEncoding = "iso-8859-1" };
var pdf = new Aspose.Pdf.Document(inputFilePath, htmlLoadOptions);
the constructor succeeds and the conversion runs through the pdf.Convert() and pdf.Save() methods, but the resulting PDF file has the wrong encoding.

We think this is an issue in Aspose.Pdf: it cannot create a Document with the correct encoding, or with no encoding specified, and only works with an incorrect encoding.

result.pdf (1.2 MB)
Testfile_2_Sverige – Wikipedia.zip (99.7 KB)

asad.ali · June 26, 2018, 7:43pm

@uaprogrammer

Thanks for contacting support.

In the first case, the Document constructor expects a PDF document, but you passed it an HTML document. When no load options are specified, the Document constructor treats the input as a PDF by default.
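To illustrate why the input type matters, here is a minimal sketch of checking a file before choosing a constructor overload. The InputSniffer class and its LooksLikePdf methods are hypothetical helpers, not part of Aspose.PDF, and looking for the "%PDF-" signature in the first 1024 bytes is a simple heuristic rather than a full format check.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of Aspose.PDF.
static class InputSniffer
{
    // Returns true when the "%PDF-" signature appears in the first 1024 bytes.
    public static bool LooksLikePdf(byte[] head)
    {
        int length = Math.Min(head.Length, 1024);
        string text = Encoding.ASCII.GetString(head, 0, length);
        return text.IndexOf("%PDF-", StringComparison.Ordinal) >= 0;
    }

    // Reads the start of the file and applies the same check.
    public static bool LooksLikePdf(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            var buffer = new byte[1024];
            // A single Read call is enough for this heuristic; it may return fewer bytes.
            int read = stream.Read(buffer, 0, buffer.Length);
            Array.Resize(ref buffer, read);
            return LooksLikePdf(buffer);
        }
    }
}
```

With a check like this, a file for which LooksLikePdf returns false could be passed to new Aspose.Pdf.Document(path, new Aspose.Pdf.HtmlLoadOptions()) instead of the single-argument constructor.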

The source HTML document uses specific fonts, namely Mangal and Gautami. To obtain correct conversion results, you need to install these fonts in your environment. After installing them in our environment and using the following code snippet, we were able to generate correct output:

Aspose.Pdf.HtmlLoadOptions objLoadOptions = new Aspose.Pdf.HtmlLoadOptions(dataDir);
objLoadOptions.PageInfo.Margin.Bottom = 0;
objLoadOptions.PageInfo.Margin.Top = 0;
objLoadOptions.PageInfo.Margin.Right = 0;
objLoadOptions.PageInfo.Margin.Left = 0;
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(dataDir + "Testfile_2_Sverige – Wikipedia.html", objLoadOptions);
doc.Save(dataDir + "SamplefromHtml.pdf");

SamplefromHtml.pdf (1.6 MB)

You do not need to change the encoding; instead, install the required fonts on your machine or device so that the API can convert the HTML into PDF correctly. Please try again with the latest version, Aspose.PDF for .NET 18.6, after installing the fonts. If the issue still persists, feel free to let us know.
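Before retrying, it may help to confirm that the fonts are actually visible on the machine. The sketch below is one possible way to do that, assuming a Windows environment where System.Drawing is available. FontPreflight, FindMissing, and InstalledFamilies are hypothetical names, and whether Aspose.PDF resolves fonts from exactly the same set is not confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing.Text;
using System.Linq;

// Hypothetical helper, not part of Aspose.PDF.
static class FontPreflight
{
    // Returns the required family names that are absent from the installed list
    // (the comparison ignores case).
    public static List<string> FindMissing(IEnumerable<string> required, IEnumerable<string> installed)
    {
        var available = new HashSet<string>(installed, StringComparer.OrdinalIgnoreCase);
        return required.Where(name => !available.Contains(name)).ToList();
    }

    // Lists the font family names that System.Drawing reports as installed.
    // Assumption: System.Drawing is available (Windows).
    public static List<string> InstalledFamilies()
    {
        using (var fonts = new InstalledFontCollection())
        {
            return fonts.Families.Select(family => family.Name).ToList();
        }
    }
}
```

For the file in this thread, FontPreflight.FindMissing(new[] { "Mangal", "Gautami" }, FontPreflight.InstalledFamilies()) would return whichever of the two fonts still needs to be installed.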

uaprogrammer · June 27, 2018, 6:58am

OK, I will retry with the fonts installed, but I still have a question: why does Aspose not require these fonts if we pass the ‘iso-8859-1’ encoding? In that case it should throw the same exception as with ‘utf-8’, because the fonts are still missing. Is this an Aspose bug?

asad.ali · June 27, 2018, 1:28pm

@uaprogrammer

Thanks for getting back to us.

We have logged an investigation ticket as PDFNET-44975 in our issue tracking system to investigate this behavior of the API. We will look further into whether handling such scenarios is feasible, and we will let you know as soon as there are updates on the investigation. Please be patient and allow us some time.

We are sorry for the inconvenience.

uaprogrammer · November 29, 2019, 1:18pm

Hi,

Would you be so kind as to provide us with a status update on the PDFNET-44975 issue?

Best regards,

Oleh

asad.ali · November 29, 2019, 8:32pm

@uaprogrammer

We are afraid that the earlier logged issue has not been resolved yet because of other high-priority issues in the queue. We will inform you as soon as there are definite updates regarding its resolution. Please allow us some time.

We are sorry for the inconvenience.

uaprogrammer · April 9, 2020, 1:58pm

Hi,

We are wondering if there are any updates on this issue?

BR
Oleh

asad.ali · April 9, 2020, 7:41pm

@uaprogrammer

We are afraid that the earlier logged issue has not been resolved yet because of its low priority, as it was logged under normal support. However, we will inform you as soon as it is resolved. Please be patient and allow us some time.

We are sorry for the inconvenience.

uaprogrammer · November 3, 2020, 4:38pm

Hello,

We are wondering if there are any updates regarding the issue in Aspose.

Thank you in advance.

Best regards,
Oleh

asad.ali · November 4, 2020, 7:44pm

@uaprogrammer

Thanks for contacting support.

Sadly, the earlier logged ticket has not been resolved yet because other issues logged before yours are still pending in the queue. However, we will inform you as soon as we have definite updates regarding the ticket's resolution. Please allow us some time.

We are sorry for the inconvenience.

uaprogrammer · January 21, 2021, 1:52pm

Hi,

We are wondering if there are any updates on this issue?

BR
Oleh

asad.ali · January 21, 2021, 9:49pm

@uaprogrammer

We are afraid that the issue PDFNET-44975 is not yet resolved. We will surely let you know once we have some news about its resolution ETA or fix.

We are sorry for the inconvenience.