Aspose.Words 24.4 DOCX to PDF conversion: Japanese, Chinese, Korean, and Hindi characters are not displayed properly (using a licensed version of Aspose.Total)

trivedi365 · March 25, 2025, 5:13pm

ByteArrayInputStream inputStream = new ByteArrayInputStream(docxBytes);
Document doc = new Document(inputStream);

// Configure font settings
FontSettings fontSettings = new FontSettings();
fontSettings.setFontsFolder("Fonts", true); // Custom font directory in java resources

// Apply font settings
doc.setFontSettings(fontSettings);

// Set PDF save options
PdfSaveOptions options = new PdfSaveOptions();
options.setEmbedFullFonts(true); // Embed all fonts
options.setCompliance(PdfCompliance.PDF_A_1A); // Ensure compliance

// Save as PDF to byte array
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
doc.save(outputStream, options);
byte[] pdfBytes = outputStream.toByteArray();

The code snippet above performs the conversion. It works fine on localhost, but in our Linux-based UAT environment it does not work, and it does not pick up fonts from the provided folder either. Are there any other ideas on how to make it work?
lorem_multilang.docx (36.5 KB)

Professionalize.Discourse · March 25, 2025, 5:14pm

@trivedi365

It seems that you are experiencing issues with the display of Japanese, Chinese, Korean, and Hindi characters when converting DOCX files to PDF using Aspose.Words in a Linux-based UAT environment. Here are some steps you can take to troubleshoot and resolve this issue:

Font Availability: Ensure that the fonts required for displaying these characters are installed on your Linux server. The setFontsFolder method you are using specifies a custom directory for fonts. Make sure that the specified folder contains the necessary font files and that the application has permission to access them.
Font Settings: You are correctly applying font settings in your code. However, if the fonts are not being picked up, you might want to check the path you provided to setFontsFolder. It should be an absolute path or a path relative to the working directory of your application.
Embedding Fonts: You are already using options.setEmbedFullFonts(true);, which is good as it ensures that all fonts are embedded in the PDF. This is particularly important for non-Latin scripts. However, if the fonts are not available during the conversion process, they will not be embedded.
Testing with Different Fonts: If possible, try using different fonts that support the required character sets. Sometimes, specific fonts may not render correctly in certain environments.
Logging and Debugging: Add logging to your application to capture any errors or warnings related to font loading. This can provide insights into what might be going wrong during the conversion process.
Environment Differences: Since the code works fine on your localhost but not on the UAT server, check for differences in the environment, such as Java version, Aspose.Words version, or any other dependencies that might affect font rendering.
Documentation and Support: If the issue persists, consider checking the Aspose documentation for any specific notes regarding font handling in Linux environments. You can also reach out to Aspose support for further assistance.
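
The path and permission checks above can be sketched as a small diagnostic. This is only a sketch using the standard Java library, not part of Aspose.Words: the class name FontFolderCheck and the method listFontFiles are hypothetical, and the extension filter (.ttf, .ttc, .otf) is an assumption about which font files are in use. It shows where a relative path such as "Fonts" actually resolves on the server and which readable font files exist there.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for diagnosing a font folder; not an Aspose.Words API.
public class FontFolderCheck {

    /** Returns the readable font files found under the given folder, searched recursively. */
    public static List<Path> listFontFiles(String folder) throws IOException {
        // A relative path such as "Fonts" resolves against the JVM working directory,
        // not against the classpath or the project's resources folder.
        Path root = Paths.get(folder).toAbsolutePath().normalize();
        if (!Files.isDirectory(root)) {
            throw new IOException("Font folder does not exist on the file system: " + root);
        }
        try (Stream<Path> files = Files.walk(root)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(Files::isReadable)
                    .filter(FontFolderCheck::hasFontExtension)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Assumed set of font file extensions; extend it if other formats are used.
    private static boolean hasFontExtension(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".ttf") || name.endsWith(".ttc") || name.endsWith(".otf");
    }
}
```

Calling FontFolderCheck.listFontFiles("Fonts") in the UAT environment and logging the result should show whether the folder passed to setFontsFolder exists on the file system at all. An exception or an empty list would suggest the fonts are only present inside the packaged application.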

By following these steps, you should be able to identify and resolve the issues with character display in your PDF conversion process.

If you need further assistance, feel free to ask!

trivedi365 · March 25, 2025, 5:16pm

The steps mentioned have already been performed, as you can see in the snippet provided, but the issue is still not resolved.

alexey.noskov · March 25, 2025, 6:58pm

@trivedi365 I cannot reproduce the problem on my side. The problem on your side might occur because the fonts used in the documents are not available in the environment where the documents are converted. Fonts are required to build an accurate document layout. If Aspose.Words cannot find the fonts used in the document, the fonts are substituted. This might lead to layout differences due to differences in font metrics. You can implement IWarningCallback to get a notification when font substitution is performed.
The following articles can be useful for you:
https://docs.aspose.com/words/java/specify-truetype-fonts-location/
https://docs.aspose.com/words/java/install-truetype-fonts-on-linux/

Please note that Aspose.Words does not read fonts packed into a JAR or WAR. In such a case, you should either copy the fonts into the OS file system or use StreamFontSource or MemoryFontSource to get the fonts from your JAR/WAR file and provide them to Aspose.Words.
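
The first option (copying the fonts into the OS file system) could look roughly like the sketch below. It uses only the standard Java library. The class name BundledFontExtractor, the method extract, and the idea of passing the font file names explicitly are assumptions made for illustration, not something taken from the Aspose.Words documentation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper; copies fonts bundled on the classpath to a real folder.
public class BundledFontExtractor {

    /**
     * Copies the named font resources from a classpath directory (for example "Fonts")
     * into a new temporary folder and returns that folder.
     */
    public static Path extract(ClassLoader loader, String resourceDir, List<String> fontFileNames)
            throws IOException {
        Path target = Files.createTempDirectory("fonts-");
        for (String name : fontFileNames) {
            // ClassLoader resource names use '/' separators and no leading slash.
            String resource = resourceDir + "/" + name;
            try (InputStream in = loader.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new IOException("Font resource not found on classpath: " + resource);
                }
                Files.copy(in, target.resolve(name), StandardCopyOption.REPLACE_EXISTING);
            }
        }
        return target;
    }
}
```

The returned folder is an ordinary directory, so its path (for example target.toString()) can be passed to fontSettings.setFontsFolder(path, true) in place of the relative "Fonts" value from the original snippet.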

trivedi365 · March 26, 2025, 5:00pm

In the StreamFontSource example code, the missing font is hard-coded. In our use case, we will not know in advance which fonts are missing when converting DOCX to PDF. How can we handle missing fonts dynamically in this case? Is there any reference code for it?

alexey.noskov · March 26, 2025, 5:53pm

@trivedi365 If there are multiple fonts, you should simply create a StreamFontSource for each of them. Then set the font sources using [FontSettings.setFontsSources](https://reference.aspose.com/words/java/com.aspose.words/fontsettings/#setFontsSources-com.aspose.words.FontSourceBase).
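
One way to avoid hard-coding font names is to register every font bundled with the application rather than only the missing ones. The sketch below uses only the standard Java library and reads all font entries under a given prefix from a JAR file. The class name JarFontLoader, the method loadFonts, the "Fonts/" prefix, and the extension filter are all assumptions for illustration. It also assumes a plain JAR on disk; nested JARs or exploded WAR deployments would need a different lookup.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper; collects the bytes of every font entry found in a JAR.
public class JarFontLoader {

    /** Returns entry name -> font bytes for every font file whose name starts with the prefix. */
    public static Map<String, byte[]> loadFonts(String jarPath, String prefix) throws IOException {
        Map<String, byte[]> fonts = new LinkedHashMap<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith(prefix) || !isFont(name)) {
                    continue;
                }
                try (InputStream in = jar.getInputStream(entry)) {
                    fonts.put(name, readAll(in));
                }
            }
        }
        return fonts;
    }

    // Assumed set of font file extensions.
    private static boolean isFont(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.endsWith(".ttf") || lower.endsWith(".ttc") || lower.endsWith(".otf");
    }

    // Reads the whole stream into a byte array (manual loop keeps Java 8 compatibility).
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return buffer.toByteArray();
    }
}
```

Each byte array in the returned map can then be wrapped in a MemoryFontSource, and the resulting array of sources passed to FontSettings.setFontsSources. With all bundled fonts registered this way, Aspose.Words should be able to pick whichever ones a given document needs, so the missing fonts do not have to be known in advance.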