Unsupported Special Characters


#1

We’re doing a POC with the trial version of Aspose.PDF and found that some special characters (Chinese characters) are corrupted in the generated PDF.
Here are sample characters that appear correctly in the HTML file but are corrupted in the PDF:
的企业正在使用或导入物联网,而中国更是高达然而
考虑到技术的复杂性和多样性

Please let us know whether this can be resolved so that we can buy the paid version.
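
To see which characters in a sample like the one above need a CJK-capable font, the text can be classified by Unicode script. This is only a minimal sketch using the standard Java library; the class and method names (ScriptCount, countByScript) are made up for illustration and are not part of Aspose.PDF.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ScriptCount {
    // Counts code points per Unicode script, in order of first appearance.
    // ScriptCount and countByScript are hypothetical names for this sketch.
    static Map<Character.UnicodeScript, Integer> countByScript(String text) {
        Map<Character.UnicodeScript, Integer> counts = new LinkedHashMap<>();
        text.codePoints().forEach(cp ->
                counts.merge(Character.UnicodeScript.of(cp), 1, Integer::sum));
        return counts;
    }

    public static void main(String[] args) {
        // Second sample line from the post above.
        String sample = "考虑到技术的复杂性和多样性";
        System.out.println(countByScript(sample));
    }
}
```

Any code point reported as HAN needs a font with Chinese glyph coverage to render in the output.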


#2

@tarasing

Would you please make sure that fonts supporting Chinese characters are installed on your system? For example, you can try installing the Arial Unicode MS font, which covers a wide range of languages. We generated a PDF file in our environment using Aspose.PDF for .NET 19.5, and the Chinese characters were displayed correctly in it.
chinesehtml.pdf (258.9 KB)

In case you still face any issue after installing the fonts, please share your sample HTML file in .zip format with us. We will test the scenario further in our environment and address it accordingly.


#3

@asad.ali
Thanks for the quick reply. I’ve installed the suggested font and can see it in MS Word. However, the characters are still corrupted in the generated PDF. We’re using Aspose.PDF for Java.
I’ve attached the basic HTML. I tried both with and without the font family shown below:
span style="font-family: Arial @Arial Unicode MS"
Neither worked.

chinesehtml.zip (397 Bytes)


#4

@tarasing

We have tested the scenario with Aspose.PDF for Java 19.4 and were still unable to reproduce the issue. Please check the following code snippet and the attached output PDF document:

HtmlLoadOptions options = new HtmlLoadOptions();
Document doc = new Document(dataDir + "chinesehtml.html", options);
doc.save(dataDir + "HTMLtoPDF_19.4.pdf");

HTMLtoPDF_19.4.pdf (185.7 KB)

Would you please try setting the font directory as follows before performing the conversion? Please specify the exact path where the installed fonts are located on your system:

FontRepository.addLocalFontPath("d:/fonts/Times/");

In case the issue still persists, please share some information about your environment, i.e. OS name and version, JDK version, application type, etc. We will then help you further. It would also be helpful if you could share a simple console application that reproduces the error.


#5

@asad.ali
Yes, it worked after loading the fonts using “FontRepository.addLocalFontPath”.
However, it’s not working in a Linux environment. I installed ‘msttcore-fonts-2.0-3.noarch.rpm’ on the Linux machine and then loaded the fonts from its directory just as above, but it still isn’t working. Interestingly, this time none of the special characters appear at all, unlike earlier when only some of the characters were corrupted. It looks as if no fonts are installed on the Linux machine whatsoever.
Can you please guide me on how to fix this issue on Linux?
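
One possibility worth ruling out is that the fonts in the added directory simply contain no Chinese glyphs. The sketch below is an independent check using java.awt.Font from the standard library, not the way Aspose.PDF resolves glyphs. The names FontCoverage and fontsCovering are hypothetical, and the default directory is only taken from this thread.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class FontCoverage {
    // Returns the names of .ttf/.otf files directly inside dir whose font
    // can display every character of text. Hypothetical helper for this sketch.
    static List<String> fontsCovering(File dir, String text) {
        List<String> covering = new ArrayList<>();
        File[] files = dir.listFiles((d, name) -> {
            String n = name.toLowerCase(Locale.ROOT);
            return n.endsWith(".ttf") || n.endsWith(".otf");
        });
        if (files == null) {
            return covering; // dir is missing, not a directory, or unreadable
        }
        for (File f : files) {
            try {
                Font font = Font.createFont(Font.TRUETYPE_FONT, f);
                // canDisplayUpTo returns -1 when the whole string is displayable.
                if (font.canDisplayUpTo(text) == -1) {
                    covering.add(f.getName());
                }
            } catch (FontFormatException | IOException e) {
                // Skip files that cannot be read or parsed as fonts.
            }
        }
        Collections.sort(covering);
        return covering;
    }

    public static void main(String[] args) {
        // Default path is the one mentioned in this thread; pass another as args[0].
        File dir = new File(args.length > 0 ? args[0] : "/usr/share/fonts/msttcore/");
        System.out.println(fontsCovering(dir, "考虑到技术的复杂性和多样性"));
    }
}
```

If the printed list is empty, no font in that directory covers the sample text, and a font with Chinese coverage would have to be installed and added as well.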


#6

@tarasing

It seems the API is unable to load the correct font directory in the Linux environment. Would you please make sure that the required fonts are present on the system and that you are setting the correct path for them? Otherwise, please share the Linux version details with us. We will test the scenario in our environment and address it accordingly.


#7

@asad.ali
Please check the environment details below.
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"

Below is the line I’m using to load the fonts from the path where they have been installed.
FontRepository.addLocalFontPath("/usr/share/fonts/msttcore/");


#8

@tarasing

Thanks for providing these details.

Would you please run the FontRepository.getLocalFontPath() method to see which directories the API searches for fonts, and share the output with us?

Also, please make sure that the font files are directly in the directory given above, not in a subdirectory of it.
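
To find out whether font files sit directly in the given directory or in subdirectories, the tree can be walked with the standard library. This is a minimal sketch: FontDirs, isFontFile and dirsWithFonts are hypothetical names, and the list of file extensions is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FontDirs {
    // Assumed set of font file extensions for this sketch.
    static boolean isFontFile(Path p) {
        String n = p.getFileName().toString().toLowerCase(Locale.ROOT);
        return n.endsWith(".ttf") || n.endsWith(".otf") || n.endsWith(".ttc");
    }

    // Returns every directory under root (root included) that directly
    // contains at least one font file, sorted by path.
    static List<Path> dirsWithFonts(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                       .filter(FontDirs::isFontFile)
                       .map(Path::getParent)
                       .distinct()
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "/usr/share/fonts");
        for (Path dir : dirsWithFonts(root)) {
            // Each printed directory is a candidate argument for
            // FontRepository.addLocalFontPath(...).
            System.out.println(dir);
        }
    }
}
```

Each directory printed could then be registered individually, so that no font is missed because it lives one level deeper than the path that was added.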


#9

@asad.ali
I checked the existing local font paths before setting mine: the value was [/usr/share/fonts/], and after calling “FontRepository.addLocalFontPath” it became [/usr/share/fonts/, /usr/share/fonts/msttcore/].
So the path I provide is added to the collection. The installed fonts are indeed present in the “/usr/share/fonts/msttcore/” directory.


#10

@tarasing

Thanks for sharing all requested details.

Please share the output PDF file generated at your end. We will log an investigation ticket in our issue tracking system for a detailed investigation of this scenario and share the ID with you.


#11

@asad.ali
Please find the output file attached. Please log the ticket with priority and let me know if you find anything.
Thanks.
IMG_1928_2.pdf (103.9 KB)


#12

@tarasing

Thank you for sharing the requested data.

We have logged an investigation ticket with ID PDFJAVA-38601 in our issue management system and will let you know as soon as any further update is available.


#13

@asad.ali @Farhan.Raza
Please let us know if there’s any progress, or whether there is any other way we could make this work.


#14

@tarasing

The investigation ticket was logged recently and is pending analysis. Please note that in the Free Support Model, issues are investigated and resolved on a first come, first served basis. As soon as we have definite updates regarding the resolution, we will let you know. Please give us a little time.

We are sorry for the inconvenience.