Render Bengali & Unicode Text in Word Document to PDF using OpenType Features (HarfBuzz) in C# .NET

I have used Aspose.Words to construct a Word document and then save it in PDF format. This works, but the PDF file does not render the Unicode content correctly.

Is there any way to render the PDF with fonts that support Unicode (UTF-8) text?

@masumss,

Please ZIP and upload your 1) input Word document, 2) output PDF file showing the undesired behavior and 3) Font files here for testing. We will then investigate the issue on our end and provide you more information.

Bangla-Source-Doc.zip (552.9 KB)

The ZIP contains the Bangla DOC file, a PDF file converted with Microsoft Word (which is rendered correctly), and the SolimanLipi font.

@masumss,

We have generated an 18.2.pdf (59.3 KB) file and an msw-2016.pdf (414.1 KB) file on our end and attached them here for your reference. Please create a comparison screenshot that highlights (encircles) the problematic areas in the PDF file generated by Aspose.Words (18.2.pdf) and attach it here. We will then investigate those areas further on our end and provide you more information. Thanks for your cooperation.

BanglaProblem.jpg (278.0 KB)

I have marked the correct rendering in green and the incorrect rendering in red. The problem occurs where a single Bangla syllable is composed of multiple characters. For example, the English ‘Bo’ is বো in Bangla. The vowel sign ে should appear before ব, but your renderer draws it as ব ে । All of the problems involve this kind of character: the parts are not combined properly and are simply rendered one after another.

We use the Avro software for writing Bangla: https://www.omicronlab.com/download/setup_avrokeyboard_5.5.0.exe
This software switches the keyboard between English and Bangla.

This software combines the words properly.
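To illustrate the ordering described above, here is a minimal sketch using only the .NET base class library (no Aspose API). It assumes the text is stored in Unicode logical order, where the consonant ব (U+09AC) comes first and the vowel sign follows it. The class and method names are hypothetical. A renderer that does no shaping draws the glyphs in this stored order, which would match the result marked in red; a shaping engine is what moves the left-hand part of the vowel sign in front of the consonant.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of Aspose.Words.
class BengaliCodePoints
{
    // Formats each UTF-16 code unit as U+XXXX (all Bengali letters are in the BMP).
    static string Describe(string s) =>
        string.Join(" ", s.Select(c => $"U+{(int)c:X4}"));

    static void Main()
    {
        // "বো" as stored: consonant BA followed by VOWEL SIGN O.
        string bo = "\u09AC\u09CB";
        Console.WriteLine(Describe(bo)); // U+09AC U+09CB

        // Canonical decomposition (NFD) splits VOWEL SIGN O into its two parts,
        // VOWEL SIGN E (drawn to the left) and VOWEL SIGN AA (drawn to the right).
        // Both parts still follow the consonant in memory.
        string decomposed = bo.Normalize(NormalizationForm.FormD);
        Console.WriteLine(Describe(decomposed));
    }
}
```

In both forms the consonant is the first code point, so the visual reordering has to happen at render time rather than in the stored text.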

Thank you for your quick response.

The attached BanglaKeyboard.zip (12.0 KB) file is a Bangla typing keyboard implementation in JavaScript.

You can test that keyboard in the comment section of this URL: ফাগুনের গান (গীতি কবিতা) - লক্ষণ ভান্ডারী এর বাংলা ব্লগ । bangla blog | সামহোয়্যার ইন ব্লগ - বাঁধ ভাঙ্গার আওয়াজ

There, the JavaScript composes Bangla properly. You can use the phonetic keyboard.

You can also see the source code of that software here: GitHub - mugli/Avro-Keyboard: Unicode compliant native Bangla/Bengali Input Method Editor (IME) for Windows

@masumss,

We have logged this issue in our bug tracking system. The ID of this issue is WORDSNET-16533. Your thread has also been linked to this issue and you will be notified as soon as it is resolved. Sorry for the inconvenience.

@masumss,

I am afraid there are no estimates available at the moment. This issue is currently in the queue, pending analysis. We will inform you via this thread as soon as the issue is resolved or an estimate becomes available. We apologize for any inconvenience.

If we save the document as an image, it renders properly but loses quality. So you already have a working solution for Bangla Unicode fonts in image rendering, but not in PDF rendering. Docx-to-Image-Bangla-Rendering.jpeg (130.3 KB)

@masumss,

I am afraid that, because of its complexity, the implementation of this issue has been postponed to a later date (no ETA is available). The Bengali script in the document requires shaping for correct rendering, and Aspose.Words currently does not support shaping for Bengali script. We have postponed this issue until ‘advanced typography’ is supported in Aspose.Words.

We have also passed these details to our product team and will update you as soon as this issue is resolved.

A post was split to a new topic: Word Unicode content save to Pdf format

@masumss,

Regarding WORDSNET-16533, we have completed the work on your issue and decided to close it as ‘Not a Bug’. Please see the following analysis details:

Aspose.Words can now render the Bengali text in this document correctly with the help of the HarfBuzz shaping engine. This requires installing the Aspose.Words.Shaping.HarfBuzz NuGet package and adding one extra line of code:

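// Load the source document.
var doc = new Document(@".\Bangla-Source-Doc.docx");
// Enable HarfBuzz text shaping before saving, so the layout is built with shaped text.
doc.LayoutOptions.TextShaperFactory = Aspose.Words.Shaping.HarfBuzz.HarfBuzzTextShaperFactory.Instance;
doc.Save(@".\Bangla-Source-Doc.pdf");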
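For a slightly fuller picture, the snippet above could be wrapped in a small console program along these lines. This is only a sketch under a few assumptions: the Aspose.Words and Aspose.Words.Shaping.HarfBuzz NuGet packages are referenced, the file paths are the ones used in this thread, and the Bangla font is not installed system-wide but sits in a local `.\Fonts` folder (a hypothetical path). The font source setup is optional and is not part of the analysis above; it is included only because a missing font would make the renderer substitute another one.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Fonts;
using Aspose.Words.Shaping.HarfBuzz;

class BengaliDocToPdf
{
    static void Main()
    {
        // Paths taken from this thread; adjust to your own files.
        const string inputPath = @".\Bangla-Source-Doc.docx";
        const string outputPath = @".\Bangla-Source-Doc.pdf";

        var doc = new Document(inputPath);

        // Optional (assumption): also look for fonts in a local folder,
        // in addition to the fonts installed on the system.
        var fontSettings = new FontSettings();
        fontSettings.SetFontsSources(new FontSourceBase[]
        {
            new SystemFontSource(),
            new FolderFontSource(@".\Fonts", true) // hypothetical folder, scanned recursively
        });
        doc.FontSettings = fontSettings;

        // Enable HarfBuzz shaping so that Bengali vowel signs and conjuncts
        // are reordered and combined during layout.
        doc.LayoutOptions.TextShaperFactory = HarfBuzzTextShaperFactory.Instance;

        // The output format is inferred from the .pdf extension.
        doc.Save(outputPath);
        Console.WriteLine($"Saved {outputPath}");
    }
}
```

The shaper factory is assigned before `Save` because the page layout is built when the document is rendered; setting it afterwards would not affect a file that has already been written.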

To learn more about OpenType features, please refer to the following article:

A post was split to a new topic: Render Bengali Text in Word Document to PDF using OpenType Features of Aspose.Words for .NET C#

The issues you have found earlier (filed as WORDSNET-16533) have been fixed in this Aspose.Words for .NET 20.8 update and this Aspose.Words for Java 20.8 update.