Aspose.Words and Aspose.PDF version 21.0.0 with .NET 6 are unable to print Japanese characters. Exporting HTML content to PDF using the code below works locally, but when it is deployed using Docker the Japanese characters are displayed as boxes. The HTML content stored in the variable wordHtmlTemplate is built with a StringBuilder by appending lines. Here is the code:
using Aspose.Words;
using System.IO;
var license = new License();
license.SetLicense(Path.Combine(licenceFilePath, "AsposeLicence", "Aspose.Total.NET.lic"));
Document doc = new Document();
DocumentBuilder documentBuilder = new DocumentBuilder(doc);
documentBuilder.InsertHtml(wordHtmlTemplate, true);
MemoryStream outStream = new MemoryStream();
doc.Save(outStream, SaveFormat.Pdf);
byte[] docBytes = outStream.ToArray();
return docBytes;
@alexey.noskov I have attached a DOCX file that includes the HTML content with some Japanese characters. After PDF conversion, those characters are displayed as boxes in the Docker deployment. Locally, on a Windows machine, they display correctly.
@kd2023 Thank you for the additional information. The problem occurs because the required fonts are not available in your Linux environment. I can reproduce the problem in a clean Linux Docker container using the following simple code:
Document doc = new Document();
DocumentBuilder documentBuilder = new DocumentBuilder(doc);
documentBuilder.InsertHtml(File.ReadAllText("/temp/in.html"), true);
doc.Save(@"/temp/out_without_fonts.pdf");
However, if I put a font (with East Asian glyphs) into a folder and use this folder as a font source, the output is correct:
Document doc = new Document();
doc.FontSettings = new FontSettings();
doc.FontSettings.SetFontsSources(new FontSourceBase[] { new SystemFontSource(), new FolderFontSource(@"/temp/fonts/", true) });
DocumentBuilder documentBuilder = new DocumentBuilder(doc);
documentBuilder.InsertHtml(File.ReadAllText("/temp/in.html"), true);
doc.Save(@"/temp/out.pdf");
@alexey.noskov, I am running the application in Docker and installed the required fonts using the msttcorefonts-installer package for the Alpine Linux distribution. However, I cannot see the “Mincho” font installed there, and I still have the issue when printing Japanese characters. Is there a font package for Alpine that covers most languages, so that I do not have to add specific fonts to the application folder?
@kd2023 You are right, the msttcorefonts-installer package does not include all MS fonts; it includes only the basic fonts. Unfortunately, there is no single package that contains all MS fonts.
You can try installing the free Noto fonts and using the Noto font fallback settings, which can be loaded using FontFallbackSettings.LoadNotoFallbackSettings. Alternatively, you can customize the predefined fallback settings according to the fonts available in your environment.
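For illustration, the suggestion above could look roughly like the following minimal sketch. It assumes the Noto font files (including a face with Japanese glyphs) have been copied into a folder in the container; the folder path, the sample HTML, and the output path are hypothetical and not taken from this thread.

```csharp
using Aspose.Words;
using Aspose.Words.Fonts;

// Assumption: Noto fonts were copied to this hypothetical folder in the image.
string notoFolder = "/app/Fonts/Noto";

Document doc = new Document();

FontSettings fontSettings = new FontSettings();
// Keep the system fonts and add the Noto folder (scanned recursively).
fontSettings.SetFontsSources(new FontSourceBase[]
{
    new SystemFontSource(),
    new FolderFontSource(notoFolder, true)
});
// Load the predefined fallback settings that rely on Noto fonts.
fontSettings.FallbackSettings.LoadNotoFallbackSettings();
doc.FontSettings = fontSettings;

// Hypothetical sample content with Japanese characters.
DocumentBuilder builder = new DocumentBuilder(doc);
builder.InsertHtml("<p>日本語のテキスト</p>");

// Hypothetical output path.
doc.Save("/tmp/out_noto.pdf");
```

If the boxes remain, it may be worth checking that the folder really contains a Noto font with Japanese glyphs and that the process can read it.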
@alexey.noskov After setting the font source path and the fallback settings, does Aspose still look in the default folders first? For example, on Linux one of the default locations where Aspose looks for fonts is “/usr/share/fonts/”.
@alexey.noskov I added the code snippet below to test whether it is picking up the Noto fonts from the directory:
[HttpGet]
[Route("generatepdfv1")]
public async Task<ActionResult> GeneratePdfV1(string content)
{
if (Directory.Exists(Path.Combine(_hostEnvironment.ContentRootPath, "Fonts", "Noto")))
{
content = string.Concat(content, "Fonts available are: ", string.Join(",", Directory.GetFiles(Path.Combine(_hostEnvironment.ContentRootPath, "Fonts", "Noto"))));
}
else
{
content = string.Concat(content, "Directory does not exists");
}
content = string.Concat(content, "Is Fonts/Noto directory exists", Directory.Exists(Path.Combine(_hostEnvironment.ContentRootPath, "Fonts", "Noto")));
// Generate the PDF as a MemoryStream
using (MemoryStream stream = new MemoryStream())
{
// Create a new Document object
Document doc = new Document();
// Convert the string content to a byte array
byte[] contentBytes = System.Text.Encoding.UTF8.GetBytes(content);
// Load the byte array into the document
using (MemoryStream contentStream = new MemoryStream(contentBytes))
{
doc.RemoveAllChildren();
doc.AppendDocument(GetDocumentInstance(contentStream), ImportFormatMode.KeepSourceFormatting);
}
// Save the document as PDF
doc.Save(stream, SaveFormat.Pdf);
// Return the PDF as a file attachment
return await Task.Run(() => File(stream.ToArray(), HTTP_CONTEXT_RESPONSE_CONTENTTYPE_PDF, "Pdf1.pdf"));
}
}
private Document GetDocumentInstance(Stream stream)
{
Document document = new Document(stream);
FontSettings fontSettings = new FontSettings();
// Set the order of font sources
fontSettings.SetFontsSources(new FontSourceBase[] { new FolderFontSource(Path.Combine(_hostEnvironment.ContentRootPath, "Fonts", "Noto"), false) });
// Load Noto fallback settings
fontSettings.FallbackSettings.LoadNotoFallbackSettings();
// Disable default font substitutions
fontSettings.SubstitutionSettings.DefaultFontSubstitution.Enabled = false;
document.FontSettings = fontSettings;
return document;
}
The code snippet above tries to export a Japanese string to a PDF file. Here is the output after passing a Japanese string as input:
image.png (53.3 KB)
But it still prints the Japanese characters as boxes.
@kd2023 Unfortunately, I still cannot reproduce the problem on my side. Here is the PDF document produced on my side with Noto Japanese fonts: out.pdf (9.8 KB)
@alexey.noskov The code snippet below seems to work for a simple Japanese input string, but not for a document that is built by appending multiple documents with multiple sections. How can it be applied to the whole document content?
// Specify the font folder
FontSettings fontSettings = new FontSettings();
fontSettings.SubstitutionSettings.DefaultFontSubstitution.Enabled = true;
fontSettings.SetFontsFolder(Path.Combine("Fonts"), true);
fontSettings.FallbackSettings.LoadNotoFallbackSettings();
fontSettings.SubstitutionSettings.FontInfoSubstitution.Enabled = false;
document.FontSettings = fontSettings;
@kd2023 Please make sure FontSettings are specified for the final document, which is saved as PDF. Could you please save the output document as both DOCX and PDF and attach them here for testing? If you convert the generated DOCX document to PDF, does the final PDF look fine?
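As a hedged sketch of what "specified for the final document" could mean for a document built by appending parts: the font settings are assigned to the merged document that is actually saved, not only to the source documents passed to AppendDocument. The class name, method name, and parameters below are hypothetical and only serve the illustration.

```csharp
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Fonts;

static class PdfExportSketch
{
    // Hypothetical helper: merges the given parts and saves the result as PDF.
    public static void SaveMergedAsPdf(
        IEnumerable<Document> parts, string fontsFolder, string outputPath)
    {
        // Start from an empty document and append every part to it.
        Document finalDoc = new Document();
        finalDoc.RemoveAllChildren();
        foreach (Document part in parts)
        {
            finalDoc.AppendDocument(part, ImportFormatMode.KeepSourceFormatting);
        }

        // Same settings as in the snippet above: font folder plus Noto fallback.
        FontSettings fontSettings = new FontSettings();
        fontSettings.SetFontsFolder(fontsFolder, true);
        fontSettings.FallbackSettings.LoadNotoFallbackSettings();

        // Assign the settings to the document that is rendered to PDF.
        finalDoc.FontSettings = fontSettings;
        finalDoc.Save(outputPath, SaveFormat.Pdf);
    }
}
```

In the GeneratePdfV1 code posted earlier, the settings are assigned inside GetDocumentInstance to the source document, while the document passed to Save has none of its own, which may be why the fallback is not applied there.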