Thai Text Renders Incorrectly | DOCX to PDF Conversion using Java

When I convert a Word file that contains Thai text to PDF, the resulting PDF does not render the Thai sentences correctly. The following image shows an example of the output:

image.png (120.3 KB)

The expected PDF output should look like the following image:

image.png (115.8 KB)

Do you support Thai word breaking within a sentence when rendering to PDF?

@PAR2020

You need to enable OpenType features, as shown below, to get the correct output.

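// MyDir is a placeholder for the folder that holds your documents.
Document document = new Document(MyDir + "input.docx");
// Enable OpenType features by using the HarfBuzz text shaper during layout.
document.getLayoutOptions().setTextShaperFactory(com.aspose.words.shaping.harfbuzz.HarfBuzzTextShaperFactory.getInstance());
document.save(MyDir + "output.pdf");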

Please read the following article for more detail.

This corrects only the font rendering; it doesn’t fix word breaking in Thai sentences.

  • Font rendering is corrected after calling setTextShaperFactory.
    image.png (36.4 KB)

  • Word breaking in Thai sentences is still not corrected.
    image.png (93.7 KB)
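
To illustrate what "word break" means here: Thai is written without spaces between words, so a renderer has to find legal line-break positions by analysing the text rather than by looking for spaces. The following is a minimal sketch that uses only the JDK's java.text.BreakIterator to list the positions where a line may be broken. It is not how Aspose.Words lays out text, and the class name, method name, and sample sentence are hypothetical. The quality of the result also depends on the JDK's Thai locale data being available, which is an assumption here.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ThaiLineBreakSketch {

    // Splits the text into the chunks between consecutive line-break
    // opportunities reported by the JDK for the given locale.
    static List<String> segments(String text, Locale locale) {
        BreakIterator breaks = BreakIterator.getLineInstance(locale);
        breaks.setText(text);

        List<String> parts = new ArrayList<>();
        int start = breaks.first();
        for (int end = breaks.next(); end != BreakIterator.DONE; end = breaks.next()) {
            parts.add(text.substring(start, end));
            start = end;
        }
        return parts;
    }

    public static void main(String[] args) {
        // Hypothetical sample sentence; replace it with text from your own document.
        String sample = "ภาษาไทยไม่มีช่องว่างระหว่างคำ";
        Locale thai = Locale.forLanguageTag("th-TH");

        // Each printed chunk ends at a position where a line break is allowed.
        for (String part : segments(sample, thai)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

A line that wraps in the middle of one of these chunks is the kind of incorrect break shown in the screenshot above.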

@PAR2020

Could you please ZIP and attach your input Word document along with the problematic and expected output PDF files? We will investigate the issue and provide you with more information.

test.zip (299.9 KB)

@PAR2020

We have tested the scenario and received missing-font notifications. Please ZIP and attach the fonts ‘AngsanaUPC’ and ‘TH SarabunPSK’ here for further testing. Thank you for your cooperation.

Fonts.zip (431.6 KB)

@PAR2020

We have tested the scenario and reproduced the issue on our side. We have logged this problem in our issue tracking system as WORDSNET-22393. You will be notified via this forum thread once the issue is resolved.

We apologize for the inconvenience.