How to generate Chinese character set

We have a need to generate a PDF file in Chinese. However, when we generate it, all the Chinese characters are little squares. Obviously, it's not using the correct character set.

I've searched the help file but can't seem to find anything about this. How does one go about setting the character set of the file? Can it be done for the whole file, or does it have to be done for each individual paragraph?

Thanks!

Cynthia

Hello Cynthia,

There are two ways to embed a font with Aspose.Pdf: embedding the complete font or embedding a font subset.

To embed the complete font, set IsFontEmbedded to true. The complete font file is then embedded into the PDF, so the PDF may be larger.

To embed a font subset, set IsUnicode to true. In that case only the subset of the font that is actually used in the PDF is embedded, not the complete font file, so the PDF may be smaller than with complete embedding.

You can set IsUnicode before saving the PDF like this:

pdf.SetUnicode();

pdf.Save();

OK, I don't seem to have a "pdf.SetUnicode()" method available. I assume that "pdf" is an object of type Aspose.Pdf.Pdf?

I must tell you I have an older version: 3.4.7.0. Before our subscription ran out, I tried updating to 3.6.1.0; however, a raft of old problems started showing up (tables not appearing correctly at the top of columns, tables overwriting other tables, etc.). These were the kind of problems I had seen when I first started using Aspose.Pdf, but they were fixed in 3.4.7.0. I did not want to hassle with these problems again, so I never upgraded to 3.6.1.0.

Anyway ... I have discovered that the Chinese will show up if I set each text object's "isUnicode" property to true and use "Arial Unicode MS" font. But, I can't use isUnicode=true on English or Spanish (or probably any language using Roman alphabet) because the justification is lost (we are using full justification). So I have to selectively set isUnicode.

So do I have to set isUnicode on each text object? Is that the only way?

Thanks,

Cynthia

Hello Cynthia,

Kindly visit the Font Handling topic in the documentation for information on using the Unicode property.

Hi Cynthia,

The SetUnicode() method was added in a recent version. Please check whether you are using the latest version.

If you are creating the PDF through the API or XML (but not converting from Word), you can set IsUnicode to true for the Chinese strings.
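If you need to set IsUnicode selectively, one way to decide per string is to check whether it contains any Chinese characters. The following is only a minimal sketch, assuming that a few common CJK ranges are enough for your content. UnicodeTextHelper and ContainsCjk are hypothetical names for illustration, not part of Aspose.Pdf, and the ranges do not cover characters outside the Basic Multilingual Plane.

```csharp
using System;

public static class UnicodeTextHelper
{
    // Hypothetical helper (not an Aspose.Pdf API): returns true if the text
    // contains at least one character from the CJK ranges listed below.
    public static bool ContainsCjk(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return false;
        }

        foreach (char c in text)
        {
            if ((c >= '\u4E00' && c <= '\u9FFF')    // CJK Unified Ideographs
                || (c >= '\u3400' && c <= '\u4DBF') // CJK Unified Ideographs Extension A
                || (c >= '\u3000' && c <= '\u303F') // CJK Symbols and Punctuation
                || (c >= '\uFF00' && c <= '\uFFEF')) // Halfwidth and Fullwidth Forms
            {
                return true;
            }
        }

        return false;
    }
}
```

You could then set IsUnicode to true only on the text objects for which ContainsCjk returns true, and leave the English or Spanish text as it is.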

It looks like I will have to upload the Arial Unicode MS font to the server.

Could you please tell me the format of the TruetypeFontMap.xml file that I have to use to point to the font file? I can't seem to find any information about it on the site.

UPDATE: I was able to get the format of the xml file by automatically generating it. I then removed everything from the file except Arial Unicode MS. I set the path to a local Fonts directory instead of the system one. Then I moved the Arial Unicode MS font file up to the server, as well as the xml file.

However, it is still not finding the font on the remote server. It still comes out with Times New Roman instead of Arial Unicode MS.

Do you know how this is done?

Hello Cynthia,

The Aspose.Pdf.TruetypeFontMap.xml file is used to improve performance. You can use TruetypeFontMapPath to specify another location for this file, one that your application has permission to access, or set IsTruetypeFontMapCached to false to disable this feature.

pdf.TruetypeFontMapPath = somepath;

A font map file named Aspose.Pdf.TruetypeFontMap.xml will be created at "somepath". If you have already used a font map, please delete the xml file and let it regenerate. Then check the entry for the Arial Unicode MS font in the xml file.

Yes I understand all that! Please stop giving me your "canned" answers!

If you will re-read my post, I said "I was able to get the format of the xml file by automatically generating it." And I understand that TruetypeFontMapPath gives the path of this file.

The problem I am having is: INSIDE the TruetypeFontMap.xml file, I gave the "Arial Unicode MS" a DIFFERENT path from the system path. Then I uploaded the Arial Unicode MS font file to the location specified in the TruetypeFontMap.xml file. However, Aspose.Pdf DID NOT find the Arial Unicode MS file. Instead it used Times New Roman.

Here are the contents of the TruetypeFontMap.xml file:

<?xml version="1.0" encoding="utf-8"?>


E:\web\sarcwriting\htdocs\Fonts\ARIALUNI.TTF
Arial Unicode MS
Arial Unicode MS
Arial Unicode MS
Arial Unicode MS
null

Could you tell me why Aspose.Pdf is not looking in the E:\web\sarcwriting\htdocs\Fonts\ directory to find the font file?

Hi,

Sorry for Nayyer's useless reply.

I tested this issue but I can't reproduce this error. Can you please provide your code and let us check it?

Please also make sure you have set the font map path correctly, as in the following:

pdf.IsTruetypeFontMapCached = true;
pdf.TruetypeFontMapPath = @"d:\test";

The Aspose.Pdf.TruetypeFontMap.xml should be in the TruetypeFontMapPath folder.
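To rule out a wrong path inside the font map, you could also check that every font file it lists really exists on the server. This is only a rough sketch: FontMapCheck and MissingFontFiles are hypothetical names, not part of Aspose.Pdf, and the code assumes the font paths are stored as element text ending in .ttf or .ttc. It does not depend on any particular element names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class FontMapCheck
{
    // Hypothetical diagnostic (not an Aspose.Pdf API): returns every font file
    // path found as element text in the XML file that does not exist on disk.
    public static List<string> MissingFontFiles(string fontMapXmlPath)
    {
        XDocument doc = XDocument.Load(fontMapXmlPath);

        return doc.Descendants()
            .Where(e => !e.HasElements)               // leaf elements only
            .Select(e => e.Value.Trim())
            .Where(v => v.EndsWith(".ttf", StringComparison.OrdinalIgnoreCase)
                     || v.EndsWith(".ttc", StringComparison.OrdinalIgnoreCase))
            .Where(path => !File.Exists(path))        // keep only missing files
            .ToList();
    }
}
```

If the returned list is empty, the paths in the file are valid for the account that runs the check, and the problem is more likely elsewhere.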

OK I got it to work!

It turns out I have to have ALL the fonts in Aspose.Pdf.TruetypeFontMap.xml, not just the Arial Unicode MS one. Once I did that, it started picking up Arial Unicode MS.

Thanks for your interest in my problem!

Cynthia