Font Substitution Questions/Issues

I’ve got some questions about font substitution. We are running Aspose.Words for Java on a Unix system that has only three fonts installed by default: DejaVu Sans, DejaVu Serif and DejaVu Mono. This leads to poorly rendered images when we convert Word documents to JPEGs. It is more of an issue for documents with Japanese, Chinese and Korean fonts: because the DejaVu fonts don’t have those characters, we get boxes instead of characters. We’ve even had issues with Latin fonts. A document that used “Trebuchet MS” rendered a misaligned table of contents when DejaVu Serif was substituted: the page numbers were not right-aligned but overlapped the line of dots (see attached picture).
I’ve implemented the WarningCallback, which reports all the font substitutions occurring for my document, and since there are only three fonts it always uses DejaVu Serif. We’d like to add more fonts to our system, preferably free/open source ones. Is there documentation on which fonts should be installed so that your default font substitution settings work without us having to use the font substitution APIs in the FontSettings class? We’d really rather not have our application be a font substitution manager.
Also, with respect to the font substitution APIs in the FontSettings class: is the addFontSubstitutes method supposed to add the given fonts to the existing list of substitutes? If so, it isn’t working. Here is some sample code that demonstrates the problem.

import com.aspose.words.FontSettings;
import java.util.Arrays;

public static void main(String[] args) {
    // Default substitutes for Courier
    String[] fs1 = FontSettings.getFontSubstitutes("Courier");
    System.out.println("Font Substitutes for Courier: " + Arrays.asList(fs1));
    // Expected to append Helvetica to the existing list
    FontSettings.addFontSubstitutes("Courier", "Helvetica");
    String[] fs2 = FontSettings.getFontSubstitutes("Courier");
    System.out.println("Font Substitutes for Courier after adding Helvetica: " + Arrays.asList(fs2));
    // Replaces the whole list with Helvetica
    FontSettings.setFontSubstitutes("Courier", "Helvetica");
    String[] fs3 = FontSettings.getFontSubstitutes("Courier");
    System.out.println("Font Substitutes for Courier after setting Helvetica: " + Arrays.asList(fs3));
}

Output:
Font Substitutes for Courier: [Courier New]
Font Substitutes for Courier after adding Helvetica: [Courier New]
Font Substitutes for Courier after setting Helvetica: [Helvetica]

When we convert Word files to JPEG we resize them as part of the process, so we don’t use document.save() with ImageSaveOptions. Instead we call document.renderToSize() and then save the resulting BufferedImage with ImageIO.write(). This gives us only one I/O operation, the final write of the resized image, as opposed to calling document.save() and then reading the saved image back in to resize it. Our problem is that because we don’t call save(), we can’t use the WarningCallback to report font substitution warnings. Is there a way to get the font substitution warnings when using document.renderToSize()?
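
For readers following along, the render-then-write pipeline described here can be sketched with the JDK alone. This is only an illustration under assumptions: the class and method names are hypothetical, and a placeholder drawing stands in for the document.renderToSize() call, since that step needs Aspose.Words. It shows a fixed-size BufferedImage being filled and then encoded to JPEG once.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class; not part of Aspose.Words.
final class RenderPipelineSketch {

    /** Draws into a fixed-size RGB image; stands in for the renderToSize() step. */
    static BufferedImage renderPlaceholder(int width, int height) {
        // TYPE_INT_RGB has no alpha channel, which suits JPEG output.
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            // In the real pipeline the page would be rendered into 'g' here.
            g.setColor(Color.DARK_GRAY);
            g.drawRect(10, 10, width - 20, height - 20);
        } finally {
            g.dispose();
        }
        return image;
    }

    /** Encodes the image as JPEG in memory, so the caller controls the single final write. */
    static byte[] toJpeg(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "jpg", out)) {
            throw new IOException("No JPEG writer available");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] jpeg = toJpeg(renderPlaceholder(320, 240));
        System.out.println("Encoded " + jpeg.length + " bytes");
    }
}
```

Because the image is already at its target size when it is encoded, nothing has to be written and read back for resizing.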

Thanks, Ian.

Hi Ian,

Thanks for your inquiry.

*icolomby:

I’ve got some questions about font substitution. We are running Aspose.Words for Java on a Unix system that has only three fonts installed by default: DejaVu Sans, DejaVu Serif and DejaVu Mono. This leads to poorly rendered images when we convert Word documents to JPEGs. It is more of an issue for documents with Japanese, Chinese and Korean fonts: because the DejaVu fonts don’t have those characters, we get boxes instead of characters. We’ve even had issues with Latin fonts. A document that used “Trebuchet MS” rendered a misaligned table of contents when DejaVu Serif was substituted: the page numbers were not right-aligned but overlapped the line of dots (see attached picture).
I’ve implemented the WarningCallback, which reports all the font substitutions occurring for my document, and since there are only three fonts it always uses DejaVu Serif. We’d like to add more fonts to our system, preferably free/open source ones. Is there documentation on which fonts should be installed so that your default font substitution settings work without us having to use the font substitution APIs in the FontSettings class? We’d really rather not have our application be a font substitution manager.*

Please note that Aspose.Words requires TrueType fonts when rendering documents to fixed-page formats (PDF, XPS, JPEG or SWF). Make sure that all the fonts used in the document are installed on the machine that converts the Word document to a fixed-page format. I suggest reading the following articles:
https://docs.aspose.com/words/java/specify-truetype-fonts-location/
https://docs.aspose.com/words/java/using-truetype-fonts/

*icolomby:

Also, with respect to the font substitution APIs in the FontSettings class: is the addFontSubstitutes method supposed to add the given fonts to the existing list of substitutes? If so, it isn’t working. Here is some sample code that demonstrates the problem.
public static void main(String[] args) {
    String[] fs1 = FontSettings.getFontSubstitutes("Courier");
    System.out.println("Font Substitutes for Courier: " + Arrays.asList(fs1));
    FontSettings.addFontSubstitutes("Courier", "Helvetica");
    String[] fs2 = FontSettings.getFontSubstitutes("Courier");
    System.out.println("Font Substitutes for Courier after adding Helvetica: " + Arrays.asList(fs2));
    FontSettings.setFontSubstitutes("Courier", "Helvetica");
    String[] fs3 = FontSettings.getFontSubstitutes("Courier");
    System.out.println("Font Substitutes for Courier after setting Helvetica: " + Arrays.asList(fs3));
}
Output:
Font Substitutes for Courier: [Courier New]
Font Substitutes for Courier after adding Helvetica: [Courier New]
Font Substitutes for Courier after setting Helvetica: [Helvetica]*

Please make sure that all the fonts used in the document are installed on the machine that converts the Word document to a fixed-page format. If you still face the problem, please share your input document here for testing. I will investigate the issue and provide more information.

*icolomby:

When we convert Word files to JPEG we resize them as part of the process, so we don’t use document.save() with ImageSaveOptions. Instead we call document.renderToSize() and then save the resulting BufferedImage with ImageIO.write(). This gives us only one I/O operation, the final write of the resized image, as opposed to calling document.save() and then reading the saved image back in to resize it. Our problem is that because we don’t call save(), we can’t use the WarningCallback to report font substitution warnings. Is there a way to get the font substitution warnings when using document.renderToSize()?*

IWarningCallback cannot currently be used with the Document.RenderToSize method. I have logged this feature request as WORDSNET-10617. Our development team will look into the possibility of implementing it. Once we have any information about this feature, we will update you via this thread.

Hi,
With respect to my first issue on the Unix system: I understand that I need the TrueType fonts installed. The problem is that we are processing the document with Aspose.Words on a Unix computer that doesn’t have access to the fonts provided by Microsoft on a Windows computer. For example, I have a Word file that uses the MS Gothic, SimSun and Batang fonts. Playing around with the font substitution methods, I see that TakaoPGothic is recognized by Aspose.Words as a substitute for MS Gothic, but the other two fonts don’t have a substitute. Why is that? Having a single default font doesn’t always work unless it is a complete font that has all the Unicode characters in it.
For my second issue, with addFontSubstitutes: I tried that on a Macintosh, and the fonts I was testing with (Arial, Courier and Helvetica) are on the computer. I’m not sure what my document would have to do with it; the FontSettings methods are static and seem document-agnostic. Set it up once and forget about it.
Lastly, should the feature request be logged against Aspose.Words for Java and not .NET as that is what I’m using?
Thanks, Ian.

Hi Ian,

Thanks for your inquiry.

*icolomby:

With respect to my first issue on the Unix system: I understand that I need the TrueType fonts installed. The problem is that we are processing the document with Aspose.Words on a Unix computer that doesn’t have access to the fonts provided by Microsoft on a Windows computer. For example, I have a Word file that uses the MS Gothic, SimSun and Batang fonts. Playing around with the font substitution methods, I see that TakaoPGothic is recognized by Aspose.Words as a substitute for MS Gothic, but the other two fonts don’t have a substitute. Why is that? Having a single default font doesn’t always work unless it is a complete font that has all the Unicode characters in it.*

Aspose.Words selects fonts according to the process described here:
https://docs.aspose.com/words/java/using-truetype-fonts/

I suggest reading about installing TrueType fonts on Linux here:
https://docs.aspose.com/words/java/install-truetype-fonts-on-linux/

As I mentioned in my previous post, Aspose.Words requires TrueType fonts when rendering documents to fixed-page formats. You need to install all the fonts used by the document on your machine before converting it to PDF.

*icolomby:

For my second issue, with addFontSubstitutes: I tried that on a Macintosh, and the fonts I was testing with (Arial, Courier and Helvetica) are on the computer. I’m not sure what my document would have to do with it; the FontSettings methods are static and seem document-agnostic. Set it up once and forget about it.*

It would be great if you could share the following details for investigation purposes:

  • Please attach your input Word document.
  • Please create a standalone, runnable, simple Java application that demonstrates the Aspose.Words code you used to generate your output document.
  • Please attach the output PDF file that shows the undesired behavior.
  • Please attach your target PDF showing the desired behavior.

As soon as you get these pieces of information to us, we’ll start our investigation into your issue.

*icolomby:

Lastly, should the feature request be logged against Aspose.Words for Java and not .NET as that is what I’m using?*

Please note that Aspose.Words for Java is completely auto-ported from .NET, i.e. we do not write code for Aspose.Words for Java; it is generated automatically from the C# code of Aspose.Words for .NET. In your case, the feature logged with the WORDSNET prefix will be resolved automatically in the Java variant of Aspose.Words as well.

I’m trying to solve this issue without using the addFontSubstitutes API, relying instead on the default substitutions provided by Aspose.Words. I’ve downloaded some CJK TrueType fonts from here: https://help.ubuntu.com/community/fonts for my Unix system. Aspose.Words finds an appropriate substitute for the Japanese documents, but for the Korean documents I still get empty boxes, and for my Chinese documents I get some correct characters and some empty boxes. Are the fonts listed at the URL above the ones you look for when substituting? I’ve attached a ZIP file with the documents I’m testing with and the resulting PDF files.

Here is the code I use:

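import com.aspose.words.Document;
import com.aspose.words.FolderFontSource;
import com.aspose.words.FontSettings;
import com.aspose.words.FontSourceBase;
import com.aspose.words.IWarningCallback;
import com.aspose.words.PdfSaveOptions;
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class WordToImage {
    // LOGGER and callback were not shown in the original post; these definitions are assumed.
    private static final Logger LOGGER = LoggerFactory.getLogger(WordToImage.class);
    private static final IWarningCallback callback =
            info -> LOGGER.warn("{}: {}", info.getWarningType(), info.getDescription());

    static {
        // Java does not expand "~", so the folder is resolved against user.home.
        // FolderFontSource takes a folder path, not a single .ttf file.
        File fontFolder = new File(System.getProperty("user.home"), "CJK-Fonts");
        List<FontSourceBase> fontSources = new ArrayList<>(Arrays.asList(FontSettings.getFontsSources()));
        fontSources.add(new FolderFontSource(fontFolder.getAbsolutePath(), true));
        FontSettings.setFontsSources(fontSources.toArray(new FontSourceBase[0]));
    }

    public static void main(String[] args) {
        try {
            // Load the Word document whose path is given as the first argument.
            Document document = new Document(args[0]);

            LOGGER.info("Save Word as PDF");
            long pdfFileName = System.currentTimeMillis();
            File dir = new File(System.getProperty("user.home"), "tmpFiles");
            File serverFile3 = new File(dir, pdfFileName + ".pdf");

            PdfSaveOptions pdfSaveOptions = new PdfSaveOptions();
            pdfSaveOptions.setWarningCallback(callback);
            document.save(serverFile3.getAbsolutePath(), pdfSaveOptions);
        } catch (Exception e) {
            LOGGER.error("Word conversion error", e);
        }
    }
}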
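
A note on home-relative font folders such as the CJK-Fonts folder in this post: a leading `~` is expanded by the shell, not by java.io.File, so Java treats it as a literal directory name. A minimal sketch of resolving such a path against the user.home system property follows; the HomePaths class and expandHome method are hypothetical names used only for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of Aspose.Words or the JDK.
final class HomePaths {

    /** Resolves a leading "~" against the user.home property; other paths pass through unchanged. */
    static Path expandHome(String path) {
        String home = System.getProperty("user.home");
        if (path.equals("~")) {
            return Paths.get(home);
        }
        if (path.startsWith("~/")) {
            return Paths.get(home, path.substring(2));
        }
        return Paths.get(path);
    }

    public static void main(String[] args) {
        // "CJK-Fonts" is the folder name from the post; adjust to the real location.
        Path fontsFolder = expandHome("~/CJK-Fonts");
        System.out.println(fontsFolder + " is a directory: " + fontsFolder.toFile().isDirectory());
    }
}
```

The resolved folder path, rather than the path of a single font file, is what would then be handed to a folder-based font source.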

I’ll post the code for the addFontSubstitute issue tomorrow.

Ian.

Hi Ian,

Thanks for your inquiry. We will wait for the code example and then investigate the issue. It would also help if you could share the fonts you are using, so that we are testing with the same setup. As soon as you get these pieces of information to us, we’ll start our investigation into your issue.

I’ve shared a zip file of the fonts I’m using on Dropbox. These are the fonts that I use with the code sample from the previous post of trying to convert DOC/DOCX file to JPEG and PDF.

Here is the code sample for the addFontSubstitutes API issue (which is separate from the rendering issue); the document is attached:

public static void main(String[] args)
{
    try
    {
        Document document = new Document("/Users/ian/Desktop/FontDoc.docx");
        String[] fs1 = FontSettings.getFontSubstitutes("Courier");
        System.out.println("Font Substitutes for Courier: " + Arrays.asList(fs1));
        FontSettings.addFontSubstitutes("Courier", "Helvetica");
        String[] fs2 = FontSettings.getFontSubstitutes("Courier");
        System.out.println("Font Substitutes for Courier after adding Helvetica (before save): " + Arrays.asList(fs2));
        document.save("/Users/ian/Desktop/FontDoc.pdf", new PdfSaveOptions());
        String[] fs3 = FontSettings.getFontSubstitutes("Courier");
        System.out.println("Font Substitutes for Courier after adding Helvetica (after save): " + Arrays.asList(fs3));
    }
    catch (Exception x)
    {
        x.printStackTrace();
    }
}

Here is the output:

Font Substitutes for Courier: [Courier New]
Font Substitutes for Courier after adding Helvetica (before save): [Courier New]
Font Substitutes for Courier after adding Helvetica (after save): [Courier New]

Am I wrong to assume that the addFontSubstitutes method should add to the list of substitutable fonts? I’d expect that after I add Helvetica to the font substitute list for Courier, the getFontSubstitutes call should return a list of two fonts: Courier New and Helvetica.
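
The add-versus-set semantics expected here can be modelled without Aspose.Words at all. The sketch below is a hypothetical stand-in (SubstitutionTable is not an Aspose.Words class) and only illustrates the expected behaviour: `add` appends to the existing list, `set` replaces it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a substitution table; not part of Aspose.Words. */
final class SubstitutionTable {
    private final Map<String, List<String>> substitutes = new HashMap<>();

    /** Replaces the whole substitute list for a font. */
    void set(String font, String... names) {
        substitutes.put(font, new ArrayList<>(Arrays.asList(names)));
    }

    /** Appends to the existing list, skipping names that are already present. */
    void add(String font, String... names) {
        List<String> current = substitutes.computeIfAbsent(font, k -> new ArrayList<>());
        for (String name : names) {
            if (!current.contains(name)) {
                current.add(name);
            }
        }
    }

    /** Returns a copy so callers cannot mutate the table. */
    List<String> get(String font) {
        return new ArrayList<>(substitutes.getOrDefault(font, new ArrayList<>()));
    }

    public static void main(String[] args) {
        SubstitutionTable table = new SubstitutionTable();
        table.set("Courier", "Courier New");      // the default reported in the post
        table.add("Courier", "Helvetica");
        System.out.println(table.get("Courier")); // [Courier New, Helvetica]
        table.set("Courier", "Helvetica");
        System.out.println(table.get("Courier")); // [Helvetica]
    }
}
```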

Ian.

Hi Ian,

Thanks for sharing the details.

*icolomby:

I’ve shared a zip file of the fonts I’m using on Dropbox. These are the fonts that I use with the code sample from the previous post of trying to convert DOC/DOCX file to JPEG and PDF.*

I have tested the scenario using the shared fonts and reproduced the same issue on my side. However, when I copy all fonts from my Windows machine to the Linux machine and test the same scenario, the output PDF file has no issues. I have attached the output PDF files to this post for your reference.

I suggest reading about installing TrueType fonts on Linux here:
https://docs.aspose.com/words/java/install-truetype-fonts-on-linux/

*icolomby:

Am I wrong to assume that the addFontSubstitutes method should add to the list of substitutable fonts.*

The FontSettings.addFontSubstitutes method adds substitute (alternative) font names for a given original font name. For example, please check the following code. The c:/temp folder contains only the Cambria font, so the Courier and Helvetica fonts are substituted with Cambria in the output PDF file.

// MyDir is a placeholder for the folder that holds the test document.
Document document = new Document(MyDir + "FontDoc.docx");
// c:/temp contains only the Cambria font.
FontSettings.setFontsFolder("c:/temp", true);
// Adds substitute (alternative) font names for the given original font name.
// Here Cambria is registered as an alternative for Georgia.
FontSettings.addFontSubstitutes("Georgia", "Cambria");
String[] fs1 = FontSettings.getFontSubstitutes("Georgia");
System.out.println("Font Substitutes for Georgia: " + Arrays.asList(fs1));
FontSettings.addFontSubstitutes("Courier", "Cambria");
FontSettings.addFontSubstitutes("Helvetica", "Cambria");
String[] fs2 = FontSettings.getFontSubstitutes("Courier");
System.out.println("Font Substitutes for Courier after adding Cambria (before save): " + Arrays.asList(fs2));
document.save(MyDir + "Out.pdf", new PdfSaveOptions());

*icolomby:

I’d expect that after I add Helvetica to the font substitute list for Courier, the getFontSubstitutes call should return a list of two fonts: Courier New and Helvetica.*

I am in communication with the development team about getFontSubstitutes method and will update you as soon as I have information on this.

The problem with the above suggestion is that the fonts installed on my Windows computer are copyrighted by Microsoft, and legally I can’t just copy them to my Unix environment. With your help I’d like to determine which equivalent fonts I can use on Unix.

Ian.

Hi Ian,

Thanks for your inquiry. Please note that Aspose.Words requires TrueType fonts when rendering documents to fixed-page formats (PDF, XPS, JPEG or SWF), so you need TrueType fonts on Linux.

Make sure that all the fonts used in the document are installed on the machine that converts the Word document to a fixed-page format. There are two main ways to get TrueType fonts on a Linux/Unix system:

  1. Copy .TTF and .TTC files from a Windows machine to your Linux/Unix machine.
  2. Install a TrueType fonts package, such as msttcorefonts.
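
Whichever route is taken, it can help to confirm which TrueType files actually ended up in a folder. The following is a minimal JDK-only sketch, not something Aspose.Words provides: the FontFileScan class is a hypothetical name, and /usr/share/fonts is assumed only as a common Linux default.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not part of Aspose.Words.
final class FontFileScan {

    /** Recursively lists .ttf and .ttc files under the given folder, sorted by path. */
    static List<Path> findTrueTypeFiles(Path folder) throws IOException {
        try (Stream<Path> paths = Files.walk(folder)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                        return name.endsWith(".ttf") || name.endsWith(".ttc");
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass another folder as the first argument.
        Path folder = Paths.get(args.length > 0 ? args[0] : "/usr/share/fonts");
        for (Path font : findTrueTypeFiles(folder)) {
            System.out.println(font);
        }
    }
}
```

The match is on file extension only, so it says nothing about which characters each font covers.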

Hi Tahir,

Thanks for the help. I can’t copy the fonts from Windows to Linux. I’ve installed the msttcorefonts but I’m still missing some CJK fonts. Does Aspose.Words match fonts by name? Say I have a document created on Windows that uses the font Batang. If I find a font available for Linux that is also called Batang, will that be a match even though they aren’t the exact same font file?

Ian.

Hi Ian,

Thanks for your inquiry.

*icolomby:

I can’t copy the fonts from Windows to Linux. I’ve installed the msttcorefonts but I’m still missing some CJK fonts.*

Since the issue cannot be reproduced once the fonts are copied from Windows to Linux, this is not a bug. Most likely some fonts are missing on your Linux machine. Please make sure that you have installed all the CJK fonts your documents use.
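
One way to check whether a candidate font covers the characters that show up as boxes is to ask the JDK which code points the font can display. This is only a sketch under assumptions: GlyphCoverage is a hypothetical class, the sample string is illustrative, and java.awt.Font reads the font file's character map, which is not necessarily how Aspose.Words looks up glyphs.

```java
import java.awt.Font;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical helper; not part of Aspose.Words.
final class GlyphCoverage {

    /** Returns the code points of 'text' for which the predicate reports "cannot display". */
    static List<Integer> missingCodePoints(IntPredicate canDisplay, String text) {
        List<Integer> missing = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            if (!canDisplay.test(cp)) {
                missing.add(cp);
            }
        });
        return missing;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("Usage: java GlyphCoverage /path/to/font.ttf");
            return;
        }
        Font font = Font.createFont(Font.TRUETYPE_FONT, new File(args[0]));
        // Illustrative sample: Japanese, Chinese and Korean text.
        String sample = "日本語 中文 한국어";
        List<Integer> missing = missingCodePoints(cp -> font.canDisplay(cp), sample);
        System.out.println(font.getFontName() + " is missing " + missing.size() + " of the sample code points");
        for (int cp : missing) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```

Running it against each installed font file shows which ones could plausibly stand in for a Korean or Chinese font.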

*icolomby:

Does Aspose.Words match fonts by name? Say I have a document created on Windows that uses the font Batang. If I find a font available for Linux that is also called Batang, will that be a match even though they aren’t the exact same font file?*

Yes, Aspose.Words tries to find a font on the file system with an exact font name match. I suggest reading the section ‘Font Availability and Substitution’ here:
https://docs.aspose.com/words/java/using-truetype-fonts/

I read the section you mentioned. Do the fonts set via add/setFontSubstitutes get checked before the default font name, or are they used in step #4, “it attempts to select the most suitable font from all of the available fonts”?
Thanks, Ian

Hi Ian,

Thanks for your inquiry. I have logged a ticket as WORDSNET-10686 in our issue tracking system about how suitable fonts are chosen when the SetFontSubstitutes/AddFontSubstitutes methods are used. We will update you via this forum thread once there is any update available on this.

Thanks for your patience.

Hi Ian,

Regarding WORDSNET-10686: currently the fonts set by the SetFontSubstitutes/AddFontSubstitutes methods are checked after step #1 of the font selection process. So if a font is not present in the system but one of its substitutes is found, the font is still considered properly resolved. In this case, steps #2 to #5 are not performed.
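
The lookup order described above can be modelled in a few lines. This is a simplified illustration, not Aspose.Words code: FontResolutionModel is a hypothetical class, steps #2 to #5 are collapsed into a single fallback font, and the assumption that the first installed substitute wins is mine.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the lookup order; not Aspose.Words code. */
final class FontResolutionModel {

    static String resolve(String requested, Set<String> installed,
                          Map<String, List<String>> substitutes, String fallback) {
        // Step #1: exact name match among the installed fonts.
        if (installed.contains(requested)) {
            return requested;
        }
        // User-defined substitutes come next (assumed: first installed one wins).
        for (String candidate : substitutes.getOrDefault(requested, Collections.emptyList())) {
            if (installed.contains(candidate)) {
                return candidate;
            }
        }
        // Steps #2 to #5 are collapsed into a single fallback in this model.
        return fallback;
    }

    public static void main(String[] args) {
        // Example data only: a system with Helvetica and DejaVu Serif installed.
        Set<String> installed = new HashSet<>(Arrays.asList("Helvetica", "DejaVu Serif"));
        Map<String, List<String>> substitutes =
                Collections.singletonMap("Courier", Arrays.asList("Courier New", "Helvetica"));

        System.out.println(resolve("Helvetica", installed, substitutes, "DejaVu Serif")); // Helvetica
        System.out.println(resolve("Courier", installed, substitutes, "DejaVu Serif"));   // Helvetica
        System.out.println(resolve("Batang", installed, substitutes, "DejaVu Serif"));    // DejaVu Serif
    }
}
```

In this model a font with a usable substitute never reaches the later steps, which matches the statement that steps #2 to #5 are skipped in that case.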

Please let us know if you have any more queries.

I’ve got another quick question about font usage. When putting the fonts on the Unix computer, is it good enough to just install, say, Arial, or should I install all the styles: Arial Bold, Arial Italic and Arial Bold Italic?
Will Aspose.Words use Java APIs to bold/italicize a font if only the regular style is present on the computer?
Thanks, Ian.

Hi Ian,

Thanks for your inquiry. You need only those fonts which you are using in your document. Please let us know if you have any more queries.

So does that mean that if I use the Arial font in my document and some of the text is bold, I should have both the Arial TTF file and the Arial Bold TTF file on my computer? Or will the Arial font be enough?

Hi Ian,

Thanks for your inquiry. Yes, in this case you need both fonts, Arial and Arial Bold.

Hi Ian,

Thanks for your patience.

*icolomby:

Am I wrong to assume that the addFontSubstitutes method should add to the list of substitutable fonts? I’d expect that after I add Helvetica to the font substitute list for Courier, the getFontSubstitutes call should return a list of two fonts: Courier New and Helvetica.*

This is a bug in Aspose.Words. For the sake of correction, I have logged this problem in our issue tracking system as WORDSNET-10754. I have linked this forum thread to the same issue and you will be notified via this forum thread once this issue is resolved.

We apologize for your inconvenience.