Saving RTF as PDF does not display fonts correctly

When I use Convert Files Online - Word, PDF, HTML, JPG And Many More or Document Editor Online to save the attached RTF as a PDF, the font is not displayed correctly. The editor displays the font correctly, but the exported PDF does not. It should be using the Tahoma font.

46971_Thailand_GHS_MSDS_Thai_reg_small_NoArialUnicodeMS.pdf (39.9 KB)

Hi @cgitechs

We are sorry for the inconvenience. I will open a ticket for the dev team and we will let you know as soon as we fix this problem.

@cgitechs Could you please attach your input RTF document here for testing? We will check the issue and provide you with more information.
FYI: @mlyra

Sorry about that. I meant to post the RTF instead of the pdf. Below is the RTF.

46971_Thailand_GHS_MSDS_Thai_reg_small_NoArialUnicodeMS.zip (849 Bytes)

@cgitechs Thank you for the additional information. Unfortunately, I cannot reproduce the problem on my side. Here is the document produced on my side using the latest 22.9 version of Aspose.Words: out.pdf (19.8 KB)

The problem might occur because Aspose.Words cannot find the fonts used in your document. Could you please try implementing IWarningCallback to check whether Aspose.Words substitutes the fonts? This might give you a hint about what the problem is.
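For reference, a minimal sketch of such a callback is shown below. The class name ConsoleWarningCallback and the file paths are placeholders chosen for illustration, not something taken from this thread. It simply prints every warning and tags the font substitution ones.

```csharp
using System;
using Aspose.Words;

// Hypothetical helper (name is an assumption): prints each warning raised by Aspose.Words.
public class ConsoleWarningCallback : IWarningCallback
{
    public void Warning(WarningInfo info)
    {
        // Tag font substitution warnings so they stand out from import/export warnings.
        string tag = info.WarningType == WarningType.FontSubstitution ? "[FONT] " : "";
        Console.WriteLine(tag + info.WarningType + ": " + info.Description);
    }
}

public static class Program
{
    public static void Main()
    {
        // Placeholder paths; adjust to your environment.
        Document doc = new Document(@"C:\Temp\in.rtf");

        // Assign the callback before saving: font substitution is reported
        // when the document layout is built during Save.
        doc.WarningCallback = new ConsoleWarningCallback();
        doc.Save(@"C:\Temp\out.pdf");
    }
}
```

If no line tagged [FONT] appears, Aspose.Words did not report a substitution for the fonts requested by the document.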

I’ll implement the IWarningCallback in a moment. Please see the attached image from opening the document in Word. The character should be a three-layer character, but as you can see in your example, the tops of the characters are overlapping.

I saw no font errors:

Import of element \uc is not supported in Rtf by Aspose.Words.
Import of element * is not supported in Rtf by Aspose.Words.
Import of element \defpap is not supported in Rtf by Aspose.Words.
Import of element \additive is not supported in Rtf by Aspose.Words.
Import of element \uc is not supported in Rtf by Aspose.Words.
Transparency does not conform to PDF/A standard and has been removed.

@cgitechs To get the desired output, you should enable OpenType features. The following code produces the correct output:

Document doc = new Document(@"C:\Temp\in.rtf");
doc.LayoutOptions.TextShaperFactory = Aspose.Words.Shaping.HarfBuzz.HarfBuzzTextShaperFactory.Instance;
doc.Save(@"C:\Temp\out.pdf");

ms.pdf (28.0 KB)
out.pdf (20.6 KB)

This only appears to work if I have installed the arial-unicode-ms.ttf font in my C:\Windows\Fonts. Can you confirm it works with some other font? I have attached my program with the IWarningCallback that outputs “Arial Unicode MS font is used in the document. Line spacing could be rendered differently.” when I have the arial-unicode-ms.ttf font installed.

AsposeTesting.zip (129.5 KB)

@cgitechs On my side this works with the original font specified in your document - Angsana New:

I have tested with the following code:

Document doc = new Document(@"C:\Temp\in.rtf");
doc.FontSettings = new FontSettings();
doc.FontSettings.SetFontsFolder(@"C:\Temp\fonts", true);
doc.LayoutOptions.TextShaperFactory = Aspose.Words.Shaping.HarfBuzz.HarfBuzzTextShaperFactory.Instance;
doc.Save(@"C:\Temp\out.pdf");

Here, C:\Temp\fonts contains only the Angsana New and Tahoma fonts.

It looks like in your case this font is not available, so Aspose.Words substitutes it. The substitution font does not contain the required glyphs, so Aspose.Words then applies its font fallback mechanism:
https://docs.aspose.com/words/net/manipulating-and-substitution-truetype-fonts/#font-fallback-settings-from-xml
Arial Unicode MS is the main font in which Aspose.Words looks for glyphs that are not available in the document's fonts.
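If Arial Unicode MS is not installed, one option is to let Aspose.Words build the fallback rules from the fonts that are actually available. The sketch below assumes the fonts live in C:\Temp\fonts as in the code above; the fallback.xml path is a placeholder added only so the generated rules can be inspected.

```csharp
using Aspose.Words;
using Aspose.Words.Fonts;
using Aspose.Words.Shaping.HarfBuzz;

public static class FallbackExample
{
    public static void Main()
    {
        Document doc = new Document(@"C:\Temp\in.rtf");

        FontSettings fontSettings = new FontSettings();
        // Use only the fonts in this folder (and its subfolders).
        fontSettings.SetFontsFolder(@"C:\Temp\fonts", true);

        // Build fallback rules by scanning the available fonts
        // instead of relying on the default fallback table.
        fontSettings.FallbackSettings.BuildAutomatic();

        // Placeholder path: write the generated rules to XML for inspection.
        fontSettings.FallbackSettings.Save(@"C:\Temp\fallback.xml");

        doc.FontSettings = fontSettings;
        doc.LayoutOptions.TextShaperFactory = HarfBuzzTextShaperFactory.Instance;
        doc.Save(@"C:\Temp\out.pdf");
    }
}
```

The saved XML can also be edited by hand and loaded back with FallbackSettings.Load, which is the approach described on the linked documentation page.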

I was able to get the font to display by adding the “foreach (Aspose.Words.Run run in doc.GetChildNodes(Aspose…” code below. However, I’m not quite sure what it is doing or why it works. The run.Font.Name value is already set to Tahoma when I hit that line of code. I’m not sure what setting the value back to “Tahoma” actually does.

FontSubstitutionWarningCollector callback = new FontSubstitutionWarningCollector();
Document doc = new Document(@"..\..\in.rtf");
doc.WarningCallback = callback;
doc.LayoutOptions.TextShaperFactory = Aspose.Words.Shaping.HarfBuzz.HarfBuzzTextShaperFactory.Instance;

foreach (Aspose.Words.Run run in doc.GetChildNodes(Aspose.Words.NodeType.Run, true).OfType<Aspose.Words.Run>())
{
    // Thai Unicode block: U+0E00 to U+0E7F (decimal 3584 to 3711)
    if (run.Text.Any(a => a >= 3584 && a <= 3711))
    {
        run.Font.Name = "Tahoma";
        break;
    }
}

doc.Save(@"..\..\out.pdf");
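One possible explanation, not confirmed in this thread: according to the Aspose.Words API reference, reading Font.Name returns NameAscii, while assigning Font.Name sets NameAscii, NameBi, NameFarEast and NameOther together. If the Thai run carries a different complex-script font in NameBi, the assignment would change it even though Font.Name already reads as Tahoma. A small diagnostic sketch to check this (the path and class name are placeholders):

```csharp
using System;
using System.Linq;
using Aspose.Words;

public static class ThaiRunFontDump
{
    // Thai Unicode block: U+0E00 to U+0E7F (decimal 3584 to 3711).
    static bool ContainsThai(string text) =>
        text.Any(c => c >= '\u0E00' && c <= '\u0E7F');

    public static void Main()
    {
        // Placeholder path; adjust to your environment.
        Document doc = new Document(@"C:\Temp\in.rtf");

        foreach (Run run in doc.GetChildNodes(NodeType.Run, true).OfType<Run>())
        {
            if (!ContainsThai(run.Text))
                continue;

            // Print all four per-script font names; Font.Name alone only reflects NameAscii.
            Font f = run.Font;
            Console.WriteLine(
                $"Ascii={f.NameAscii}, Bi={f.NameBi}, FarEast={f.NameFarEast}, Other={f.NameOther}");
        }
    }
}
```

If the Bi value differs from Tahoma before the assignment, that would account for the behaviour described above.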

@cgitechs We have posted our answers simultaneously. Please check my answer regarding fonts fallback mechanism.

I’m not sure where the Angsana New font is coming from. The font table in the RTF has:

  • Arial
  • Times New Roman
  • Tahoma
  • Verdana

e.g.
{\fonttbl{\f0 Arial;}{\f1 Times New Roman;}{\f2 Tahoma;}{\f3 Verdana;}{\f4 Tahoma;}}

@cgitechs You are right. It looks like MS Word does the same as Aspose.Words - it uses a fallback font for this character: