How to Render Colored Emoji Glyphs | DOCX to PDF Conversion using Java

Hi dear Aspose developers!

I get incorrect output when exporting DOCX to PDF with Aspose.Words.

Using this simple test code:

final Document doc = new Document(getClass().getResourceAsStream("test_emoji.docx"));
doc.save(Files.createTempFile("test_emoji", ".pdf").toString());

emojis are not rendered as expected in PDF export:

In the source document they look like this:
(3.0 KB)

Here is my test input:
test_emoji.docx (5.9 KB)

I first encountered this problem in Aspose 20.5 (jdk17), and it persists after upgrading to version 21.6.

Could you please look into this?

Regards

Henri

@hmdebenque

Please note that Aspose.Words requires TrueType fonts when rendering documents to fixed-page formats (JPEG, PNG, PDF or XPS). You need to install the fonts that are used in your document on the machine where you are converting documents to PDF. Please refer to the following articles:
Using TrueType Fonts
Manipulating and Substitution TrueType Fonts

Hi @tahir.manzoor

Thanks for this quick answer. I am convinced that this is not the problem, as I have the whole set of Google Noto fonts installed and emoji are even displayed inside my terminal. Moreover, our app is able to render emojis correctly in PDFs through Apache PDFBox.
I am not really familiar with the font fallback system Aspose might have, but shouldn’t it fall back to whatever font is able to render the characters when needed?

In the meantime, I will dig into the resources you shared. Thank you.

[EDIT]
Further details: running this command on my machine confirms that I have an emoji font installed:

➜ ~ fc-list | grep -i "Emoji"
/usr/share/fonts/noto/NotoColorEmoji.ttf: Noto Color Emoji:style=Regular

I also updated my code:

final Document doc = new Document(getClass().getResourceAsStream("test_emoji.docx"));
final FontSettings fontSettings = new FontSettings();
fontSettings.getFallbackSettings().loadNotoFallbackSettings();
doc.setFontSettings(fontSettings);
final FontSourceBase[] fontsSources = fontSettings.getFontsSources();
// fontsSources size = 1 with only a SystemFontSource
final List<String> emojiFont = fontsSources[0]
    .getAvailableFonts() // 2678 fonts
    .stream()
    .map(PhysicalFontInfo::getFullFontName)
    .filter(s -> s.contains("Emoji"))
    .collect(Collectors.toList());
// no "emoji" font found

[EDIT2]
We also have a NotoFallbackSetting.xml configuration file containing:

    <!--
    U+2190..U+21FF    Arrows
    U+2300..U+23FF    Miscellaneous Technical
    U+2600..U+26FF    Miscellaneous Symbols
    U+2700..U+27BF    Dingbats
    U+1F100..U+1F1FF    Enclosed Alphanumeric Supplement
    U+1F200..U+1F2FF    Enclosed Ideographic Supplement
    U+1F300..U+1F5FF    Miscellaneous Symbols and Pictographs
    U+1F600..U+1F64F    Emoticons
    U+1F680..U+1F6FF    Transport and Map Symbols
    -->
    <Rule Ranges="2190-21FF, 2300-23FF, 2600-27BF, 1F100-1F64F, 1F680-1F6FF" FallbackFonts="Noto Color Emoji"/>
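One thing worth checking with a rule like this is whether its Ranges actually cover every emoji in the document. For example, U+1F970 and U+1F929 sit in the Supplemental Symbols and Pictographs block (U+1F900..U+1F9FF), which the ranges above do not include. The following is a minimal plain-Java sketch, independent of Aspose.Words, that parses a Ranges string and lists the code points it does not cover. The class and method names (FallbackRangeCheck, parseRanges, isCovered, uncovered) are hypothetical helpers made up for illustration, and the sketch assumes the attribute is a comma-separated list of hexadecimal start-end pairs as shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Aspose.Words: reports which code points
// of a text fall outside the ranges of a fallback rule.
class FallbackRangeCheck {

    /** Parses "2190-21FF, 2300-23FF" into inclusive [start, end] pairs (hex). */
    static List<int[]> parseRanges(String ranges) {
        List<int[]> result = new ArrayList<>();
        for (String part : ranges.split(",")) {
            String[] bounds = part.trim().split("-");
            int start = Integer.parseInt(bounds[0], 16);
            // A single value such as "2190" is treated as a one-element range.
            int end = Integer.parseInt(bounds[bounds.length - 1], 16);
            result.add(new int[] {start, end});
        }
        return result;
    }

    /** Returns true if the code point lies inside at least one range. */
    static boolean isCovered(List<int[]> ranges, int codePoint) {
        for (int[] r : ranges) {
            if (codePoint >= r[0] && codePoint <= r[1]) {
                return true;
            }
        }
        return false;
    }

    /** Lists the code points of text (as U+XXXX) that no range covers. */
    static List<String> uncovered(String ranges, String text) {
        List<int[]> parsed = parseRanges(ranges);
        List<String> missing = new ArrayList<>();
        text.codePoints()
            .filter(cp -> !isCovered(parsed, cp))
            .forEach(cp -> missing.add(String.format("U+%04X", cp)));
        return missing;
    }

    public static void main(String[] args) {
        String rule = "2190-21FF, 2300-23FF, 2600-27BF, 1F100-1F64F, 1F680-1F6FF";
        // U+1F61A, U+1F970, U+1F619, U+1F60C, U+1F929 written as surrogate pairs
        String text = "\uD83D\uDE1A\uD83E\uDD70\uD83D\uDE19\uD83D\uDE0C\uD83E\uDD29";
        System.out.println(uncovered(rule, text)); // prints [U+1F970, U+1F929]
    }
}
```

Covering a range only helps if the fallback font is actually visible to the renderer, which the font listing above suggests is not the case here, so this check is a diagnostic aid rather than a fix.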

@hmdebenque

Could you please ZIP and attach your problematic and expected output PDF file here for testing? Please also share the Noto fonts that you are using for Emoji. We will investigate this issue and provide you more information on it.

Of course.
@tahir.manzoor, here are the original document, the expected output, and the scrambled output I get, along with the Noto font installed on my system.
emoji rendering error files.zip (9.6 MB)

@hmdebenque
We have tested the scenario using the latest version of Aspose.Words for Java 21.6 and get the desired output. Please check the attached PDF.
21.6.java.pdf (32.4 KB)

The font name for Emoji is ‘Segoe UI Emoji’ and it is exported correctly in output PDF.
Segoe UI Emoji.png (38.2 KB)

You need to set the font name of Emoji according to your requirement or substitute the font ‘Segoe UI Emoji’ with Noto font. Please check the members of TableSubstitutionRule class from here:

Hi @tahir.manzoor
Thanks for the update, I will try to do something based on this.
However, the PDF result you linked is still missing some characters, and the Segoe UI Emoji font is proprietary to Microsoft, distributed with Windows 10 under Microsoft licensing. We won’t be able to include it in our products.

I can’t make it work.

I tried a new approach, generating the test document with Apache POI:

XWPFDocument document = new XWPFDocument();
XWPFParagraph tmpParagraph = document.createParagraph();
XWPFRun tmpRun = tmpParagraph.createRun();
tmpRun.setFontFamily("Noto Color Emoji");
tmpRun.setText("\uD83D\uDE1A\uD83E\uDD70\uD83D\uDE19\uD83D\uDE0C\uD83E\uDD29");
tmpRun.setFontSize(18);
final File docWithEmojis = File.createTempFile("test_emoji_in_doc", ".docx");
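try (FileOutputStream out = new FileOutputStream(docWithEmojis)) {
    document.write(out); // close the stream so the file is fully written before it is read back
}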
document.close();

Then I export it in PDF through Aspose:

new Document(new FileInputStream(docWithEmojis)).save(Files.createTempFile("test_emoji", ".pdf").toString());
  • The emojis are directly using the correct “Noto Color Emoji” font
  • The docx is rendered correctly in LibreOffice and in Google Docs
  • Aspose is still unable to render the emojis in the PDF

@hmdebenque

If you open the original.docx in MS Word, the Emoji does not render.
ms word.png (26.6 KB)

Your second document has font ‘Segoe UI Emoji’ for Emoji and it renders correctly as shown in MS Word. Please check the attached image.
emoji output.png (24.5 KB)

MS Word does not render the Emoji as Google Docs does.

For your case, we have logged this problem in our issue tracking system as WORDSNET-22379. You will be notified via this forum thread once there is an update available on it.

We apologize for your inconvenience.

@tahir.manzoor
I can’t do this: I have neither Windows nor MS Word. Our systems run on Linux, and “Segoe UI Emoji” is proprietary to Microsoft.
Thanks for opening the bug and for all your time.

@hmdebenque

We will inform you via this forum thread once this feature is available. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.

Hello @tahir.manzoor

I still can’t render emojis in PDF. Could you have a look?
I tried the following, still without success:

final String text = "☺😂❤️😍🤣😊🥺🙏💕😭😘👍😅";
final TxtLoadOptions options = new TxtLoadOptions();
final com.aspose.words.Document input = new com.aspose.words.Document(
	new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)), options);
final ByteArrayOutputStream pdf = new ByteArrayOutputStream();

input.save(pdf, new PdfSaveOptions());

final org.apache.pdfbox.pdmodel.PDDocument result = org.apache.pdfbox.pdmodel.PDDocument.load(pdf.toByteArray());
assertThat(new org.apache.pdfbox.text.PDFTextStripper().getText(result)).isEqualTo(text);

The assertion fails; the text extracted from the PDF contains only the first character, although the input has many more:

org.opentest4j.AssertionFailedError: 
expected: "☺😂❤️😍🤣😊🥺🙏💕😭😘👍😅"    
but was: "☺"
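When a failure like this is hard to read in a console, it can help to compare the two strings code point by code point rather than char by char, since most emoji are stored as surrogate pairs in Java strings. Below is a small plain-Java sketch of that idea; it does not use Aspose.Words or PDFBox, the class and method names (CodePointDiff, describe, firstMismatch) are hypothetical, and the sample string is a shortened stand-in written with escapes, not the exact test input.

```java
// Hypothetical diagnostic helper: compares strings by Unicode code point.
class CodePointDiff {

    /** Formats each code point as U+XXXX; '*' marks code points outside the BMP. */
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            if (Character.isSupplementaryCodePoint(cp)) {
                sb.append('*');
            }
        });
        return sb.toString();
    }

    /** Index (counted in code points) of the first difference, or -1 if equal. */
    static int firstMismatch(String expected, String actual) {
        int[] e = expected.codePoints().toArray();
        int[] a = actual.codePoints().toArray();
        int common = Math.min(e.length, a.length);
        for (int i = 0; i < common; i++) {
            if (e[i] != a[i]) {
                return i;
            }
        }
        // Equal prefix: strings differ only if one is longer than the other.
        return e.length == a.length ? -1 : common;
    }

    public static void main(String[] args) {
        // U+263A, U+1F602 (surrogate pair), U+2764, U+FE0F (variation selector)
        String expected = "\u263A\uD83D\uDE02\u2764\uFE0F";
        String actual = "\u263A";
        System.out.println(describe(expected));
        // prints: U+263A U+1F602* U+2764 U+FE0F
        System.out.println("first mismatch at code point " + firstMismatch(expected, actual));
        // prints: first mismatch at code point 1
    }
}
```

Passing the expected text and the text returned by the stripper to firstMismatch shows exactly where extraction stops, and describe makes it visible whether the lost characters lie outside the Basic Multilingual Plane or include invisible ones such as U+FE0F.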

image.png (7.8 KB)

Note that I have all Noto fonts installed on my machine and it renders correctly as text.
There is no difference when loading default Noto fallback settings.

Thank you

@hmdebenque Unfortunately, WORDSNET-22379 has been postponed. So this feature is not yet supported by Aspose.Words.