PDF and non-ASCII characters

We are currently using Aspose.Pdf 4.1. We've run into an issue where HTML-to-PDF conversion does not display non-ASCII characters correctly. If we have to upgrade, I need to know that the latest version fully supports non-ASCII characters.


Given a simple input example such as:
HANS-ROTH-STRAßE 1
The PDF that is generated shows: HANS-ROTH-STRAßE 1


Current code from v4 looks something like this (clipped for brevity):

Document htmlDocument = new Document(ms);
MemoryStream outputstream = new MemoryStream();
htmlDocument.SaveOptions.PdfExportImagesFolder = Path.GetTempPath();
htmlDocument.SaveOptions.TxtExportEncoding = Encoding.Unicode;
htmlDocument.SaveOptions.HtmlExportEncoding = Encoding.Unicode;
htmlDocument.Save(outputstream, SaveFormat.AsposePdf);
Pdf pdfDocument = new Pdf();
pdfDocument.BindXML(outputstream, null);
using (MemoryStream outputPdf = new MemoryStream())
{
    pdfDocument.Save(outputPdf);
    // ... (clipped)
}


Thanks,
Ryan

Hi Ryan,


Thanks for using our products.

I have tested the scenario with Aspose.Pdf for .NET 7.7.0 using the following code snippet and, as per my observations, the non-ASCII characters appear correctly in the resultant PDF file. For your reference, I have also attached the resultant PDF, which I generated with v7.7.0.

We are sorry for this inconvenience.

[C#]

Pdf pdfDocument = new Pdf();
pdfDocument.ParseToPdf("HANS-ROTH-STRAßE 1 ");
pdfDocument.Save("c:/pdftest/UniCode_Characters.pdf");

Great, thanks.