Korean characters issue when saving word doc as pdf

Try using this snippet:

	String documentsPath = "d:\\temp\\pdfissue\\";
	
	Document doc = new Document();
	DocumentBuilder builder = new DocumentBuilder(doc);
	 
	builder.writeln("Hello World! SOP-751 동의서.docx");
	
	doc.save(documentsPath + "KoreanCharactersIssue.docx");
	doc.save(documentsPath + "KoreanCharactersIssue.pdf", com.aspose.words.SaveFormat.PDF);

I used version 17.9 of the aspose-words library.
When you open the docx version in MS Word, it looks correct.
When you open the pdf version, the Korean characters are replaced by rectangle placeholders.

Thanks,
martin

@konopka

Thanks for your inquiry. We have tested the scenario and could not reproduce the reported issue. It looks like a missing-font issue on your side. You can implement IWarningCallback to double-check this. Please install the missing TrueType fonts; hopefully that will resolve the issue.
KoreanCharactersIssue.pdf (20.3 KB)

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.writeln("Hello World! SOP-751 동의서.docx");
doc.setWarningCallback(new HandleDocumentWarnings());
doc.save("KoreanCharactersIssue.pdf", com.aspose.words.SaveFormat.PDF);

////////

class HandleDocumentWarnings implements IWarningCallback
{
    /**
     * Our callback only needs to implement the warning() method. It is called whenever there is a
     * potential issue during document processing. The callback can be set to listen for warnings
     * generated during document load and/or document save.
     */
    public void warning(WarningInfo info)
    {
        // We are only interested in fonts being substituted.
        if (info.getWarningType() == WarningType.FONT_SUBSTITUTION)
        {
            System.out.println("Font substitution: " + info.getDescription());
        }
    }
}

I added the callback, but there were no font substitution warnings:

Saving as docx
Fonts are not embedded, despite either 'EmbedSystemFonts' or 'SaveSubsetFonts' is set to 'true' due to 'EmbedTrueTypeFonts' option is set to 'false'.
Saving as pdf
Sep 28, 2017 9:46:53 AM java.util.prefs.WindowsPreferences <init>
WARNING: Could not open/create prefs root node Software\JavaSoft\Prefs at root 0x80000002. Windows RegCreateKeyEx(...) returned error code 5.

So I tried setting EmbedTrueTypeFonts to true, but it made no difference.
I am attaching the output files I get: KoreanCharactersIssue.pdf (41.6 KB)
pdfissue.zip (45.1 KB)

This is what I see when I open the pdf version btw: image.png (25.8 KB)

Can you give any other advice as to what else could be wrong here?

Thanks,
martin

@konopka

Thanks for your feedback. Please convert the output Word document KoreanCharactersIssue.docx to PDF using MS Word and share the result here. Please also share your environment details.

Here is the document exported from MS Word as PDF: KoreanCharactersIssueSavedFromMSWord.pdf (15.4 KB)

The same issue happens on multiple environments:
My dev machine: MS Windows 10 x64

We also use the Aspose libraries inside a web application that is deployed on Linux servers [Ubuntu/RedHat], and the same issue occurs there.

What other information do you need about environments?

@konopka

Thanks for your feedback. We are testing the scenario on Windows 10 and we will share our findings here shortly.

@konopka

Thanks for your patience. Please note that you are not specifying a font when adding the text, so the API uses the default font, ‘Times New Roman’, which does not contain glyphs for the Korean text. Please specify an appropriate font as follows; it will resolve the issue.

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.getFont().setName("Malgun Gothic");
builder.writeln("Hello World! SOP-751 동의서.docx");
doc.save("KoreanCharactersIssue.docx");
doc.save("KoreanCharactersIssue.pdf", com.aspose.words.SaveFormat.PDF);

Hi,
unfortunately this solution is not acceptable for us.
The reason is that we generate and manipulate documents with text that comes from user input.
Why is the document fine when saved as a Word document, but FAILS to render those characters when saved as pdf?
Clearly there must be some font installed on the machine(s) that can render the characters correctly for the Word version. Why isn’t the pdf generator using the same font?

Thanks,
martin
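
For text that comes from user input, one possible interim workaround is to split the text by script and assign a font per piece, instead of hard-coding a single font. The following is only a minimal sketch using plain JDK classes: the class ScriptFontSketch, its Run type, and the two font names are assumptions made for illustration and are not part of Aspose.Words. It handles only Hangul; other scripts would need their own mapping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an Aspose.Words API): splits text into runs by script
// so that each run can be written with a font that contains its glyphs.
class ScriptFontSketch {

    // Assumed font names; replace them with fonts that are installed on your machines.
    static final String DEFAULT_FONT = "Times New Roman";
    static final String KOREAN_FONT = "Malgun Gothic";

    /** A piece of text together with the font name chosen for it. */
    static final class Run {
        final String text;
        final String fontName;

        Run(String text, String fontName) {
            this.text = text;
            this.fontName = fontName;
        }
    }

    /** Returns true if the code point belongs to the Hangul script. */
    static boolean isHangul(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HANGUL;
    }

    /** Chooses a font name for a single code point. */
    static String fontFor(int codePoint) {
        return isHangul(codePoint) ? KOREAN_FONT : DEFAULT_FONT;
    }

    /** Splits the text into consecutive runs that share the same chosen font. */
    static List<Run> split(String text) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentFont = null;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String font = fontFor(codePoint);
            // Close the current run whenever the chosen font changes.
            if (currentFont != null && !font.equals(currentFont)) {
                runs.add(new Run(current.toString(), currentFont));
                current.setLength(0);
            }
            currentFont = font;
            current.appendCodePoint(codePoint);
            i += Character.charCount(codePoint);
        }
        if (current.length() > 0) {
            runs.add(new Run(current.toString(), currentFont));
        }
        return runs;
    }

    public static void main(String[] args) {
        // The Korean part of the sample text is written with Unicode escapes
        // so that the source compiles regardless of the file encoding.
        String sample = "Hello World! SOP-751 \uB3D9\uC758\uC11C.docx";
        for (Run run : split(sample)) {
            System.out.println(run.fontName + ": \"" + run.text + "\"");
        }
    }
}
```

Each run could then be written with builder.getFont().setName(run.fontName) followed by builder.write(run.text). This has not been verified against version 17.9, so please treat it as a starting point only.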

@konopka

Thanks for your feedback. After initial investigation, we have logged a ticket WORDSNET-15965 in our issue tracking system for further investigation and rectification. We will keep you updated about the issue resolution progress within this forum thread.

The issues you found earlier (filed as WORDSNET-15965) have been fixed in the Aspose.Words for .NET 18.10 and Aspose.Words for Java 18.10 updates.