I am evaluating Aspose.Words for converting DOC files to PDF.
My DOC file contains Indic characters (Hindi/Malayalam, i.e. Indian-language scripts). The DOC file is attached.
The DOC file was created using OpenOffice, and I am able to convert it to PDF using OpenOffice.
However, if I convert the DOC file to PDF using Aspose.Words, the Indic characters are not displayed properly.
Does Aspose.Words support Indian languages?
I am also attaching the PDF converted using Aspose.Words…
Thanks in advance for any help.
Hi Kishore,
- Lohit Marathi
- Liberation Serif
Hi,
I am now able to get the converted PDF to display Malayalam (Indian language) characters.
However, ligature substitution (Indic rendering) is not happening, so the characters are simply displayed in the order in which they occur.
Do you support ligature substitution for Unicode fonts?
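For anyone checking whether a document is affected by this, here is a minimal sketch that tests whether a string contains Malayalam or Devanagari code points, i.e. text that depends on a shaping step for correct rendering. It uses only the JDK; the class name `ScriptCheck` and the method `needsShaping` are made up for illustration and are not part of Aspose.Words.

```java
public class ScriptCheck {
    // Hypothetical helper: returns true if any code point in the text
    // belongs to the Malayalam or Devanagari Unicode script.
    static boolean needsShaping(String text) {
        return text.codePoints()
                .mapToObj(Character.UnicodeScript::of)
                .anyMatch(s -> s == Character.UnicodeScript.MALAYALAM
                        || s == Character.UnicodeScript.DEVANAGARI);
    }

    public static void main(String[] args) {
        // Malayalam KA + VIRAMA + KA, written with Unicode escapes.
        System.out.println(needsShaping("\u0D15\u0D4D\u0D15"));
        // Plain Latin text.
        System.out.println(needsShaping("plain text"));
    }
}
```

The check only covers the two scripts mentioned in this thread; other complex scripts would need to be added to the condition.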
Thanks,
Kishore
Hi Kishore,
Hi,
I have attached the following files:
1. The original DOC file - mal.doc
(you should be able to view this file using the ArialUnicodeMS font)
2. The PDF created using OpenOffice
3. The PDF created using Aspose.Words
4. The font file I used for the Aspose.Words conversion -
If you compare the OpenOffice and Aspose.Words PDFs, you can see that ligature substitution is not happening.
Kishore
Hi Kishore,
- Lohit Marathi
- Liberation Serif
Hi,
The language I used is Malayalam.
So, I am attaching the following fonts: Lohit Marathi, Lohit Malayalam and Liberation Serif.
Thanks,
Kishore
Hi Kishore,
Hi,
Ligature substitution is still not happening here. From your PDF file:
ക് ക -> ക്ക (here the three characters are meant to be substituted with this single glyph automatically)
വേക is shown incorrectly as വ േക
Please look at the original Word document; there is a difference.
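To make the two examples above concrete, here is a small JDK-only sketch that prints the Unicode code points stored for each string. The class name `CodePointDump` and the method `describe` are made up for illustration. The stored order is the logical order; forming the conjunct and placing the vowel sign before the consonant is the job of the text shaper, which is why the output looks wrong when that step is skipped.

```java
public class CodePointDump {
    // Hypothetical helper: formats each code point of the text as U+XXXX.
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // KA + VIRAMA + KA: three stored characters, rendered as one conjunct.
        System.out.println(describe("\u0D15\u0D4D\u0D15")); // U+0D15 U+0D4D U+0D15
        // VA + VOWEL SIGN EE + KA: the vowel sign is stored after VA
        // but is drawn to its left once the text is shaped.
        System.out.println(describe("\u0D35\u0D47\u0D15")); // U+0D35 U+0D47 U+0D15
    }
}
```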
Thanks,
Kishore
Hi Kishore,
Regarding WORDSNET-13503, the fix for this issue will be included in Aspose.Words 20.8 (the next version). We will inform you via this thread as soon as that version is released at the start of next month.
After that, you will need to run the following code to get the desired output:
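// Load the source document.
Document doc = new Document("E:\\Temp\\mal.doc");
// Enable HarfBuzz text shaping so that Indic ligatures are substituted during layout.
doc.getLayoutOptions().setTextShaperFactory(com.aspose.words.shaping.harfbuzz.HarfBuzzTextShaperFactory.getInstance());
// Save as PDF; the output format is inferred from the file extension.
doc.save("E:\\Temp\\mal.AW.20.7.HarfBuzz.pdf");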
Please also refer to the following page:
The PDF file generated using the above code (after applying the fix on our end) is also attached here for your reference:
- mal.AW.20.7.HarfBuzz.pdf (14.6 KB)
The issues you have found earlier (filed as WORDSNET-13503) have been fixed in this Aspose.Words for .NET 20.8 update and this Aspose.Words for Java 20.8 update.