Bullets in a Word document do not render correctly when the document is converted to an image. After conversion, the bullets are missing.
Below is the source document:
Format Resume.docx (13.7 KB)
Below is the converted image
page1.jpeg (26.1 KB)
Why is this happening?
Regards
Anit
@anitnair I cannot reproduce the problem on my side. Here is the image produced by the following simple code:
Document doc = new Document("C:\\Temp\\in.docx");
doc.save("C:\\Temp\\out.png");
out.png (6.2 KB)
Most likely the problem on your side occurs because the fonts required to render the bullets are not available in the environment where the document is converted. If Aspose.Words cannot find a font used in the document, the font is substituted. This can lead to mismatched glyphs and layout changes, because the substitute font has different metrics. You can implement IWarningCallback to get notifications when font substitution is performed.
Please see our documentation to learn where Aspose.Words looks for fonts:
https://docs.aspose.com/words/java/specifying-truetype-fonts-location/
Hi @alexey.noskov ,
I believe you are executing this on a system where the Windows bullet fonts are available. In my case, I am running it in a Kubernetes container, and we do not bundle the Windows fonts in our container. Is there a way to substitute these MS bullet fonts with open source fonts?
Regards
Anit
@anitnair You can try using free Google Noto fonts:
https://docs.aspose.com/words/java/manipulate-and-substitute-truetype-fonts/#predefined-font-fallback-settings-for-google-noto-fonts
But bullets in MS Word documents use the Windows Symbol and Wingdings fonts, so using other fonts might lead to rendering issues.
Thanks @alexey.noskov. In that case, what would be the best way to avoid such issues? MS Word fonts are licensed, and we would mostly be using open source fonts. Since most of our input documents are likely to be MS Word documents, we will certainly run into this missing font issue. What is the best way to fix it, specifically for MS fonts?
@anitnair The problem is that the Windows “Symbol” font is a symbolic font (like “Webdings”, “Wingdings”, etc.) that uses the Unicode Private Use Area (PUA). The macOS or Linux “Symbol” font, as well as the Noto Symbol font, is a proper Unicode font (for example, Greek characters are in the U+0370…U+03FF Greek and Coptic block). So these fonts are incompatible, and the Mac/Linux “Symbol” font cannot be used instead of the Windows “Symbol” font without additional actions. For example, the standard bullet in the Windows Symbol font is U+F0B7 (or U+00B7, which MS Word also accepts for symbolic fonts), but in a Unicode font it is U+2022. The following code loads the Noto fallback settings and replaces the Symbol bullet with its Unicode equivalent:
// The folder contains Noto Sans fonts https://fonts.google.com/noto
FontSettings.getDefaultInstance().setFontsSources(new FontSourceBase[] { new FolderFontSource("C:\\temp\\fonts", true) });
FontSettings.getDefaultInstance().getFallbackSettings().loadNotoFallbackSettings();

Document doc = new Document("C:\\Temp\\in.docx");
// FontSubstitutionWarningCollector is a user-defined IWarningCallback implementation (not shown here).
doc.setWarningCallback(new FontSubstitutionWarningCollector());

for (com.aspose.words.List lst : doc.getLists())
{
    for (com.aspose.words.ListLevel level : lst.getListLevels())
    {
        // Replace the U+F0B7 symbol (Windows Symbol bullet) with U+2022 (Unicode bullet).
        if (level.getFont().getName().equals("Symbol") && level.getNumberFormat().equals("\uF0B7"))
            level.setNumberFormat("\u2022");
    }
}

doc.save("C:\\Temp\\out.png");
So unfortunately, there is no simple way to replace Windows Symbolic fonts without additional actions.
Hi @alexey.noskov, is it safe to check for ‘xF0B7’ in all bullet fonts and simply replace it with x2022? In other words, can every Windows bullet font character be replaced with x2022?
@anitnair xF0B7 is a black dot bullet. Other types of bullets should be processed separately.
Is there a way or convention to identify the list of bullet fonts, so that any font in this list falls back to a substitute?
@anitnair You can use the code proposed above to process bullets. Unfortunately, there is no ready-to-use solution that handles all types of bullets.
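As a rough illustration of what processing several bullet types could look like, here is a minimal sketch of a lookup table keyed by font name and symbol code. It uses only the Java standard library. BulletMapper and its method names are hypothetical and not part of Aspose.Words. The Symbol entry comes from the discussion above; the Wingdings entries are assumptions based on the usual glyph charts and should be verified against your own documents.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: maps bullet characters of Windows symbolic fonts to Unicode. */
final class BulletMapper {
    // Key: font name + '|' + symbol code in the 0x00..0xFF range.
    private static final Map<String, String> TABLE = new HashMap<>();

    static {
        TABLE.put(key("Symbol", 0xB7), "\u2022");    // round bullet (from the thread)
        TABLE.put(key("Wingdings", 0xA7), "\u25AA"); // small black square (assumption)
        TABLE.put(key("Wingdings", 0x76), "\u2756"); // diamond ornament (assumption)
        TABLE.put(key("Wingdings", 0xD8), "\u27A2"); // arrowhead (assumption)
    }

    private static String key(String fontName, int code) {
        return fontName + "|" + code;
    }

    /** Returns the Unicode replacement, or null when the bullet is not in the table. */
    static String toUnicode(String fontName, String numberFormat) {
        // Only single-character bullets are handled; numbered formats are skipped.
        if (fontName == null || numberFormat == null || numberFormat.length() != 1) {
            return null;
        }
        int code = numberFormat.charAt(0);
        // Symbolic fonts may store a glyph as U+F0xx (PUA) or as U+00xx; normalize both.
        if (code >= 0xF000 && code <= 0xF0FF) {
            code -= 0xF000;
        }
        if (code > 0xFF) {
            return null;
        }
        return TABLE.get(key(fontName, code));
    }
}
```

Inside the list-level loop shown earlier, the fixed condition could then be replaced by a call such as BulletMapper.toUnicode(level.getFont().getName(), level.getNumberFormat()), calling level.setNumberFormat only when the result is not null. Bullets that are not in the table are left unchanged, so the font substitution warnings still show which ones remain to be handled.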
Hi @alexey.noskov ,
Wouldn’t iterating through each list, and each list level within it, degrade performance for every conversion?
Regards
Anit
@anitnair It may cause a slight performance degradation, but unfortunately there is no easier solution.