Word to PDF Khmer Unicode Font Problem

Hi supporter,

I have a question about Unicode fonts when converting a Word document to PDF in Java with Aspose.Words. The conversion code runs in an AWS Lambda function, with the TTF font file packaged in a fonts folder, but the result is not what I expected. Can you help me with this problem? (I have attached the code and the PDF file.)

Here is java code:

public String handleRequest(S3Event event, Context context) {
    context.getLogger().log("Received event: " + event);

    // Get the object from the event and show its content type
    String bucket = event.getRecords().get(0).getS3().getBucket().getName();
    String key = event.getRecords().get(0).getS3().getObject().getKey();
    try {
        S3Object response = s3.getObject(new GetObjectRequest(bucket, key));
        //String contentType = response.getObjectMetadata().getContentType();
        //context.getLogger().log("CONTENT TYPE: " + contentType);
        //return contentType;
        ByteArrayOutputStream data = new ByteArrayOutputStream();
        FontSettings.getDefaultInstance().setFontsFolder("/fonts", true);
        Document doc = new Document(response.getObjectContent());
        PdfSaveOptions options = new PdfSaveOptions();
        options.setEmbedFullFonts(true);
        options.setSaveFormat(SaveFormat.PDF);
        doc.save(data, options);
        ObjectMetadata metaContext = new ObjectMetadata();
        InputStream inputData = new ByteArrayInputStream(data.toByteArray());
        s3.putObject(new PutObjectRequest("weprint-pdf", "test.pdf", inputData, metaContext));
        return null;
    } catch (Exception e) {
        e.printStackTrace();
        context.getLogger().log(String.format(
            "Error getting object %s from bucket %s. Make sure they exist and"
            + " your bucket is in the same region as this function.", key, bucket));
        return e.getMessage();
    }
}

test.pdf (184.7 KB)


@sethathay,

Thanks for your inquiry. Please read the following article:
How to Receive Notification of Missing Fonts and Font Substitution during Rendering

Please make sure that the fonts are installed on the system where you convert DOCX to PDF. If you still face the problem, please ZIP and attach your input Word document and fonts here for testing. We will investigate the issue and provide more information.
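One quick way to check font availability on AWS Lambda is to log what the function can actually see. The code above passes the absolute path "/fonts" to setFontsFolder, while Lambda normally unpacks the deployment package under the directory named by the LAMBDA_TASK_ROOT environment variable (usually /var/task). The sketch below uses only the Java standard library and assumes the fonts are bundled in a folder named fonts at the root of the package; the class and method names (FontFolderCheck, resolveFontsDir, listFontFiles) are made up for illustration and are not part of Aspose.Words.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not an Aspose.Words class.
class FontFolderCheck {

    // Resolve a folder relative to the Lambda deployment root.
    // Falls back to /var/task (the usual root) when taskRoot is not set.
    static Path resolveFontsDir(String taskRoot, String relative) {
        String root = (taskRoot == null || taskRoot.isEmpty()) ? "/var/task" : taskRoot;
        return Paths.get(root, relative);
    }

    // Return the .ttf/.otf files directly inside dir, sorted by path.
    // Throws IOException when the folder does not exist.
    static List<Path> listFontFiles(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            throw new IOException("Fonts folder not found: " + dir);
        }
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .filter(Files::isRegularFile)
                .filter(p -> {
                    String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                    return name.endsWith(".ttf") || name.endsWith(".otf");
                })
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        // Assumption: the fonts live in "<package root>/fonts".
        Path fontsDir = resolveFontsDir(System.getenv("LAMBDA_TASK_ROOT"), "fonts");
        System.out.println("Fonts folder: " + fontsDir);
        try {
            for (Path font : listFontFiles(fontsDir)) {
                System.out.println("Found font file: " + font.getFileName());
            }
        } catch (IOException e) {
            System.out.println("Problem reading fonts folder: " + e.getMessage());
        }
    }
}
```

If the list comes back empty or the folder is missing, the path given to setFontsFolder is the first thing to check; the string form of the resolved path (fontsDir.toString()) can be passed there instead of "/fonts".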

@tahir.manzoor

Thank you for your reply. That partly solved the problem: the characters now appear in the PDF file, but they are not rendered correctly, as the attached file shows. Do you have any idea what is happening?

Note: I am using font: Khmer Unicode.

Thanks

test12345.pdf (927.3 KB)

@sethathay,

What version of Java are you using on your end (e.g. OpenJDK 8) and on what OS?
Please also ZIP and upload your input Word document and Font files here for testing.

We will then be able to investigate the issue on our end and provide you more information.

@awais.hafeez I am using JDK 8. Here are my project ZIP file (fonts included) and the input DOCX file.

LambdaExample.zip (583.3 KB)

input.zip (9.2 KB)

@sethathay,

Using the latest version of Aspose.Words, i.e. 18.1, we managed to reproduce this issue on our end. We have logged it in our bug tracking system with the ID WORDSNET-16336. Your thread has also been linked to this issue, and you will be notified as soon as it is resolved. Sorry for the inconvenience.

Thank you for your support. I am looking forward to a solution from your awesome team.

@sethathay,

This issue is currently in the queue, pending analysis. We will inform you via this thread as soon as it is resolved.

@awais.hafeez

I dug further into Unicode fonts and found that, to render this kind of Unicode text, Windows uses the Unicode Script Processor (Uniscribe) to rearrange text from its input sequence into its visual sequence. I wonder whether your team applies this kind of processing to text in the embedded Unicode font before writing the PDF. I hope this tip helps you find a solution.
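To illustrate what I mean by input sequence versus visual sequence, here is a toy sketch. It is only my own simplification, not how Windows or Aspose.Words actually shapes text: real shaping also handles split vowels, register shifters, subscript forms and glyph substitution from the font's OpenType tables. It relies on the fact that Khmer text is stored with the consonant first, while the vowel signs U+17C1 to U+17C3 are drawn to the left of the consonant cluster. The class and method names (KhmerReorderSketch, toNaiveVisualOrder) are made up for this example.

```java
// Toy illustration only; not a real shaping engine.
class KhmerReorderSketch {

    // COENG marks the following consonant as a subscript.
    private static final char COENG = '\u17D2';

    // Vowel signs E, AE and AI are drawn to the left of the consonant cluster.
    static boolean isPreBaseVowel(char c) {
        return c >= '\u17C1' && c <= '\u17C3';
    }

    // Khmer consonants KA..QA.
    static boolean isConsonant(char c) {
        return c >= '\u1780' && c <= '\u17A2';
    }

    // Move each pre-base vowel in front of the consonant cluster it follows.
    static String toNaiveVisualOrder(String logical) {
        StringBuilder out = new StringBuilder(logical.length());
        int clusterStart = -1; // position in 'out' where the current cluster begins
        for (int i = 0; i < logical.length(); i++) {
            char c = logical.charAt(i);
            if (isPreBaseVowel(c) && clusterStart >= 0) {
                out.insert(clusterStart, c);
                clusterStart = -1;
                continue;
            }
            boolean subscript = isConsonant(c) && i > 0 && logical.charAt(i - 1) == COENG;
            if (isConsonant(c) && !subscript) {
                clusterStart = out.length(); // a new base consonant starts a cluster
            } else if (c != COENG && !subscript) {
                clusterStart = -1;           // anything else ends the cluster
            }
            out.append(c);
        }
        return out.toString();
    }

    // Print a string as a list of code points, e.g. "U+1780 U+17C2".
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String logical = "\u1781\u17D2\u1798\u17C2\u179A"; // the word "Khmer" as stored
        System.out.println("stored : " + describe(logical));
        System.out.println("visual : " + describe(toNaiveVisualOrder(logical)));
    }
}
```

A renderer that draws the stored sequence as is, without a reordering step of this kind, would place those vowel signs on the wrong side of the consonant, which looks similar to what I see in my PDF.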

Thanks

@sethathay,

We have passed this information to our product team and will keep you informed of further updates.


Has this problem been solved?

I need the same help

@zhongyingbai,

We need your input Word document and the output PDF file generated by Aspose.Words to reproduce the same issue on our end. Please follow your other thread for further updates.

The issue you reported earlier (filed as WORDSNET-16336) has been fixed in the Aspose.Words for .NET 21.1 update and the Aspose.Words for Java 21.1 update.