The Aspose.PDF ShowText operator, which implements the PDF Tj operator, does not support Chinese characters. I am using Windows 10 and Aspose.PDF for .NET 21.10.1 (the latest version at the time of writing), and the problem still exists in this version. Here is my code snippet:
    using Aspose.Pdf;
    using Aspose.Pdf.Text;

    namespace ConsoleApp
    {
        class Program
        {
            static void Main(string[] args)
            {
                var document = new Document();
                var page = document.Pages.Add();
                string fontResName;
                {
                    Font font = FontRepository.FindFont("Arial");
                    font.IsEmbedded = false;
                    page.Resources.GetFonts(true).Add(font, out fontResName);
                }
                {
                    Font font = page.Resources.Fonts[1];
                    var fontSize = 9.0;
                    var line = "CJK Unified Ideographs from BMP (wiki): 一 丁 丂 七 丄 丅 丆 万 丈 三 上 下 丌 不 与 丏";
                    page.Contents.Add(new Aspose.Pdf.Operators.GSave());
                    page.Contents.Add(new Aspose.Pdf.Operators.BT());
                    page.Contents.Add(new Aspose.Pdf.Operators.SelectFont(fontResName, fontSize));
                    page.Contents.Add(new Aspose.Pdf.Operators.MoveTextPosition(120.0, 750.0));
                    page.Contents.Add(new Aspose.Pdf.Operators.ShowText(line, font));
                    page.Contents.Add(new Aspose.Pdf.Operators.ET());
                    page.Contents.Add(new Aspose.Pdf.Operators.GRestore());
                }
                document.Save("test_chinese_characters.pdf");
            }
        }
    }
I am using the Arial font installed on my system. Word processors on the same system display the problematic characters without issue, so this is unlikely to be a font problem. I also tried the Courier New and Times New Roman fonts: Aspose.PDF finds these fonts and renders European-language glyphs correctly, but Asian characters still do not render, so the issue appears to be specific to Asian characters.
In addition to Chinese characters, at least the following character sets are also unsupported: Hindi letters, Japanese kanji, Japanese hiragana, Japanese katakana, Korean alphabet, Korean hanja, Japanese hiragana from the Unicode Basic Multilingual Plane (BMP), and CJK Unified Ideographs from the BMP.
I am aware that the TextFragment class offers better support for Unicode characters. However, due to technical constraints in my project, I cannot use TextFragment (or TextStamp) to display text; it has to be the PDF Tj operator as implemented by the ShowText class in Aspose.PDF. This request is specifically about Unicode support in the Text property of the ShowText class, not about text display in general.
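To help narrow down which characters are affected, a rough check along the following lines may be useful. This is only a sketch and does not call any Aspose.PDF API. It assumes, without confirmation from Aspose, that the failures correlate with characters whose code point does not fit in a single byte (a crude proxy for the single-byte encodings commonly used with simple PDF fonts). The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Aspose.PDF.
static class TjEncodingCheck
{
    // Returns the distinct characters whose UTF-16 code unit is above 0xFF.
    // Assumption: such characters cannot be represented by a one-byte code,
    // so they are candidates for the rendering failure described above.
    public static List<char> CharsOutsideSingleByteRange(string text)
    {
        return text.Where(c => c > 0xFF).Distinct().ToList();
    }

    static void Main()
    {
        var line = "CJK Unified Ideographs from BMP (wiki): 一 丁 丂 七";
        foreach (char c in CharsOutsideSingleByteRange(line))
        {
            // Print each suspect character with its code point for the report.
            Console.WriteLine($"U+{(int)c:X4} {c}");
        }
    }
}
```

Running this over each line of the test suite would give a list of suspect code points to compare against the characters that actually fail in the generated PDF.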
Aspose_PDF_Chinese_characters_support.zip (39.4 KB)
The attachment contains the following: (a) an example C# project/solution that reproduces the issue using the latest Aspose.PDF for .NET; (b) test_chinese_characters.pdf, an example of the broken PDF generated by Aspose.PDF; (c) _unicode_characters_test.txt, an extended test suite of Unicode characters, in which most of the Asian characters fail (all of the characters display correctly in my word processor with the Arial, Courier New, and Times New Roman fonts).
Expected behavior: the ShowText operator renders Chinese and other Unicode characters correctly.
Can you please help me with this issue?