HarfBuzz for Aspose.Cells, Aspose.Slides, Aspose.PDF

Hi!

We have successfully used the Aspose.Words.Shaping.HarfBuzz package for Word-to-PDF export of documents containing Thai and Sinhalese text. We are now looking for similar solutions for Aspose.Slides, Aspose.Cells, and Aspose.PDF HTML-to-PDF exports. From what we’ve seen in various forum threads, this is currently not supported. This feature is important and would provide high value for us. Is it on any roadmap, or are there any workarounds for the formatting issues we’re having?
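
For context, the Aspose.Words setup referred to above typically looks like the sketch below. It assumes the Aspose.Words and Aspose.Words.Shaping.HarfBuzz NuGet packages are referenced; the file names are placeholders, not files from this thread.

```csharp
using Aspose.Words;
using Aspose.Words.Shaping.HarfBuzz;

class WordToPdfWithShaping
{
    static void Main()
    {
        // "input.docx" and "output.pdf" are placeholder paths (assumptions).
        var document = new Document("input.docx");

        // Enable HarfBuzz text shaping so that combining marks in complex
        // scripts are positioned using the font's OpenType features.
        document.LayoutOptions.TextShaperFactory = HarfBuzzTextShaperFactory.Instance;

        document.Save("output.pdf");
    }
}
```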

@lars.olsson,

For Aspose.Cells, could you please share the input MS Excel or HTML files (containing Thai/Sinhalese text) that you are trying to convert to PDF? We will look into it soon.

Similarly, for the Aspose.PDF and Aspose.Slides APIs, please also share sample documents or HTML files.

RenderErrors.zip (1.8 MB)
See the attached zip file for PPTX and XLSX files with corresponding PDF files. Some characters render correctly, while others show minor or major issues. For Thai it has been challenging to find good examples, but you will find some issues with the last two examples. The files contain the characters themselves as well as the expected rendering, based on screenshots from Excel and PowerPoint. There appear to be minor differences due to variations in the fonts used; please ignore those. We are primarily interested in the deviations in composite characters, where characters are rendered in an incorrect position.

@lars.olsson,

Thanks for the sample files.

Regarding Aspose.Cells, I tested your scenario and reproduced the issue by converting your template XLSX file to PDF. I found an issue with rendering Thai and Sinhalese text in Excel-to-PDF conversion. We need to evaluate it in detail. We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CELLSNET-53617

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

@lars.olsson,
As for PPTX to PDF conversion, please check whether the fonts used in the PowerPoint presentation (Iskoola Pota, Cordia New) were installed on the operating system where the conversion was performed. If the fonts are missing, please install them or load them before the conversion like this:

FontsLoader.LoadExternalFonts(yourCustomFonts);

Documents: Custom Font
API Reference: FontsLoader class
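
A slightly fuller sketch of the call above, for illustration only. It assumes the font files sit in a folder such as C:\CustomFonts (a hypothetical path) and uses placeholder file names; FontsLoader.LoadExternalFonts takes an array of folder paths.

```csharp
using Aspose.Slides;
using Aspose.Slides.Export;

class PptxToPdfWithCustomFonts
{
    static void Main()
    {
        // Hypothetical folder holding the .ttf files for the fonts used in the deck.
        string[] fontFolders = { @"C:\CustomFonts" };

        // Make the fonts available to Aspose.Slides before the presentation is loaded.
        FontsLoader.LoadExternalFonts(fontFolders);
        try
        {
            // "input.pptx" and "output.pdf" are placeholder paths.
            using (var presentation = new Presentation("input.pptx"))
            {
                presentation.Save("output.pdf", SaveFormat.Pdf);
            }
        }
        finally
        {
            // Release the externally loaded fonts once the conversion is done.
            FontsLoader.ClearCache();
        }
    }
}
```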

We also recommend that you use the latest version of Aspose.Slides for .NET if possible.

If the issue persists, please share the following files and information:

  • font files (Iskoola Pota, Cordia New) you used
  • OS version on which the conversion was performed
  • .NET target platform in your app
  • Aspose.Slides version you used

@amjad.sahi Thank you!

@andrey.potapov Thank you, this inspired us to take a deep dive into the sea of fonts and Unicode. We found the cause of the font issue: we were creating the PPTX files on Windows 10 but generating the PDFs on Windows Server 2019, where a different set of default fonts is installed, so the expected fonts were not available when rendering the PDF. We changed the fonts to Leelawadee UI and Nirmala UI to ensure that the same fonts are installed everywhere. Unfortunately, the issue with composite characters still remains. See the attached zip for files.
The issue has been observed on Windows Server 2019 and Windows 10, .NET Framework 4.8 and .NET 6, Aspose.Slides 23.5 and 23.6.
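
A font mismatch between machines like the one described above can be caught before conversion with a small check. This is only a sketch: it assumes Windows and the System.Drawing API (on .NET 6 that means the System.Drawing.Common package), and FontCheck and FindMissing are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing.Text;
using System.Linq;

static class FontCheck
{
    // Returns the required family names that are absent from the installed
    // list, comparing names case-insensitively.
    public static List<string> FindMissing(IEnumerable<string> required, IEnumerable<string> installed)
    {
        var installedSet = new HashSet<string>(installed, StringComparer.OrdinalIgnoreCase);
        return required.Where(name => !installedSet.Contains(name)).ToList();
    }

    static void Main()
    {
        // Font families named in this thread; adjust to the fonts your files use.
        string[] required = { "Leelawadee UI", "Nirmala UI" };

        using (var fonts = new InstalledFontCollection())
        {
            var installed = fonts.Families.Select(family => family.Name);
            foreach (string name in FindMissing(required, installed))
            {
                Console.WriteLine("Missing font: " + name);
            }
        }
    }
}
```

Running this on both the authoring machine and the server shows whether the same families are available in both places.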

RenderErrorsCorrectFont.zip (2.4 MB)

@lars.olsson,
I reproduced the problem with text characters when converting the PPTX file to a PDF document.

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): SLIDESNET-44073

Thank you all for your quick responses; we’re grateful for the two tickets created so far. There’s just one more thing: HTML-to-PDF conversion has the same issue, and unless that’s included in one of the tickets, we haven’t really addressed it yet. We’ve created a sample with a minimal HTML document that shows the same complex characters rendered incorrectly. The reference images are screenshots from Chrome, and they look identical in Edge and Firefox.
A PDF generated by Aspose.PDF in .NET 6 on Windows 11 is included as index.pdf.
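
For reference, an HTML-to-PDF conversion of this kind can be sketched as follows. This is an assumption about the setup, not the exact code used to produce the attachment; the input name index.html is assumed from the output name.

```csharp
using Aspose.Pdf;

class HtmlToPdf
{
    static void Main()
    {
        // Default HTML load options; "index.html" is an assumed input file name.
        var loadOptions = new HtmlLoadOptions();

        using (var document = new Document("index.html", loadOptions))
        {
            // Saving a Document loaded from HTML writes it out as PDF.
            document.Save("index.pdf");
        }
    }
}
```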

html.zip (651.4 KB)

@lars.olsson

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-54926
