Missing fonts cause Aspose.Words process to hang

Hello,

We’ve been heavily testing Aspose.Words for Java for document-to-PDF conversion, and we found a few documents that cause the underlying Aspose process to hang.

This happens when we instantiate the Document class with a Word .docx document that uses fonts which have not previously been loaded through FontSettings.

No exception is thrown; Aspose utilizes the CPU at 100% and never recovers.
We mitigate this issue by spinning up a new thread for each Document instance. If the conversion (loading the document and saving it as PDF) does not finish within a certain time period, we kill the thread and the underlying Aspose process with it.
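
For readers who want to try the same mitigation, here is a minimal sketch of a timeout wrapper using only the standard java.util.concurrent classes. It is not our production code. The class name TimedConversion and the method runWithTimeout are made up for illustration, and the lambdas in main stand in for the real "load document and save as PDF" step. One caveat: Future.cancel(true) only interrupts the worker, so a thread spinning in a CPU-bound loop that never checks the interrupt flag keeps running. The worker is therefore created as a daemon thread so that it cannot keep the JVM alive.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedConversion {

    /**
     * Runs the task on a daemon worker thread and waits at most timeoutMillis.
     * Throws TimeoutException if the task does not finish in time.
     */
    public static <T> T runWithTimeout(Callable<T> task, long timeoutMillis)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "conversion-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // only interrupts; a busy loop may ignore it
            throw e;
        } catch (ExecutionException e) {
            // Unwrap so the caller sees the original failure.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real conversion step (hypothetical).
        System.out.println(runWithTimeout(() -> "converted", 1000));
        try {
            runWithTimeout(() -> { Thread.sleep(5000); return "late"; }, 100);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

If the hanging code never reacts to interrupts, running each conversion in a separate OS process that can be killed is probably the more robust variant of the same idea.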

We have not observed this behavior when all fonts are available.

Most documents with missing fonts are still converted, either with substituted fonts or with missing characters. A few documents, however, hang the process as described above.

This is especially concerning because we plan to use Aspose on the server and we cannot guarantee that all documents will contain only available fonts.

Do you have any advice on how to proceed and properly handle this problem?
Our solution of timing out threads seems more like a hack than a proper fix.

Thank you for your help in advance!

@ondrs Could you please attach the problematic documents here for testing and provide simple code or a test application that will allow us to reproduce the problem? We will check the issue and provide you with more information.

@alexey.noskov is it possible to send you the file via a non-public channel?

@ondrs It is safe to attach documents in the forum. Only you as a topic starter and Aspose staff can see the attachments. Also, you can send the files via private message in the forum.

We are running Aspose.Words for Java 22.4.

The OS should not be relevant, as the problem occurs in every environment, but for clarity I am attaching the OS info from our Docker container anyway.

OS info:

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

JVM info:

openjdk version "11.0.6" 2020-01-14 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.6+10-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.6+10-LTS, mixed mode, sharing)

Fonts loaded in Aspose:

"DejaVu Sans Bold"
"DejaVu Sans Bold Oblique"
"DejaVu Sans ExtraLight"
"DejaVu Sans Oblique"
"DejaVu Sans"
"DejaVu Sans Condensed Bold"
"DejaVu Sans Condensed Bold Oblique"
"DejaVu Sans Condensed Oblique"
"DejaVu Sans Condensed"
"Liberation Mono Bold"
"Liberation Mono Bold Italic"
"Liberation Mono Italic"
"Liberation Mono"
"Liberation Sans Bold"
"Liberation Sans Bold Italic"
"Liberation Sans Italic"
"Liberation Sans"
"Liberation Serif Bold"
"Liberation Serif Bold Italic"
"Liberation Serif Italic"
"Liberation Serif"
"C059-BdIta"
"C059-Bold"
"C059-Italic"
"C059-Roman"
"D050000L"
"NimbusMonoPS-Bold"
"NimbusMonoPS-BoldItalic"
"NimbusMonoPS-Italic"
"NimbusMonoPS-Regular"
"NimbusRoman-Bold"
"NimbusRoman-BoldItalic"
"NimbusRoman-Italic"
"NimbusRoman-Regular"
"NimbusSans-Bold"
"NimbusSans-BoldItalic"
"NimbusSans-Italic"
"NimbusSans-Regular"
"NimbusSansNarrow-Bold"
"NimbusSansNarrow-BoldOblique"
"NimbusSansNarrow-Oblique"
"NimbusSansNarrow-Regular"
"P052-Bold"
"P052-BoldItalic"
"P052-Italic"
"P052-Roman"
"URWBookman-Demi"
"URWBookman-DemiItalic"
"URWBookman-Light"
"URWBookman-LightItalic"
"URWGothic-Book"
"URWGothic-BookOblique"
"URWGothic-Demi"
"URWGothic-DemiOblique"
"Z003-MediumItalic"

Files:
RefField (1).docx (26.4 KB)
Tables in header footer with TrCh.docx (102.6 KB)
RefField revised (1).docx (26.9 KB)

Now, if I load the license and call this code with one of the files from above

Document doc = new Document("Input.docx");
doc.save("Output.pdf");

the entire process hangs while utilizing the CPU at 100%.

If the license is not loaded, the conversion “will work”. It’s not really useful since the number of pages is limited and they are all watermarked.

Loading more fonts will solve the problem - basically, the entire MS Office font set. Here is the complete list:

"DejaVu Sans Bold"
"DejaVu Sans Bold Oblique"
"DejaVu Sans ExtraLight"
"DejaVu Sans Oblique"
"DejaVu Sans"
"DejaVu Sans Condensed Bold"
"DejaVu Sans Condensed Bold Oblique"
"DejaVu Sans Condensed Oblique"
"DejaVu Sans Condensed"
"Liberation Mono Bold"
"Liberation Mono Bold Italic"
"Liberation Mono Italic"
"Liberation Mono"
"Liberation Sans Bold"
"Liberation Sans Bold Italic"
"Liberation Sans Italic"
"Liberation Sans"
"Liberation Serif Bold"
"Liberation Serif Bold Italic"
"Liberation Serif Italic"
"Liberation Serif"
"C059-BdIta"
"C059-Bold"
"C059-Italic"
"C059-Roman"
"D050000L"
"NimbusMonoPS-Bold"
"NimbusMonoPS-BoldItalic"
"NimbusMonoPS-Italic"
"NimbusMonoPS-Regular"
"NimbusRoman-Bold"
"NimbusRoman-BoldItalic"
"NimbusRoman-Italic"
"NimbusRoman-Regular"
"NimbusSans-Bold"
"NimbusSans-BoldItalic"
"NimbusSans-Italic"
"NimbusSans-Regular"
"NimbusSansNarrow-Bold"
"NimbusSansNarrow-BoldOblique"
"NimbusSansNarrow-Oblique"
"NimbusSansNarrow-Regular"
"P052-Bold"
"P052-BoldItalic"
"P052-Italic"
"P052-Roman"
"URWBookman-Demi"
"URWBookman-DemiItalic"
"URWBookman-Light"
"URWBookman-LightItalic"
"URWGothic-Book"
"URWGothic-BookOblique"
"URWGothic-Demi"
"URWGothic-DemiOblique"
"Z003-MediumItalic"
"Symbol"
"Webdings"
"Wingdings 2"
"Wingdings 3"
"Wingdings"
"Abadi MT Condensed Extra Bold"
"Abadi MT Condensed Light"
"Angsana New"
"Angsana New Bold"
"Angsana New Italic"
"Angsana New Bold Italic"
"Arial"
"Arial Bold"
"Arial Bold Italic"
"Arial Italic"
"Arial Narrow Bold Italic"
"Arial Narrow Italic"
"Arial Rounded MT Bold"
"Arial Black"
"Baskerville Old Face"
"Batang"
"BatangChe"
"Gungsuh"
"GungsuhChe"
"Bauhaus 93"
"Bell MT"
"Bell MT Bold"
"Bell MT Italic"
"Bernard MT Condensed"
"Book Antiqua Bold Italic"
"Book Antiqua Bold"
"Book Antiqua Italic"
"Book Antiqua"
"Bookman Old Style Bold Italic"
"Bookman Old Style Bold"
"Bookman Old Style Italic"
"Bookman Old Style"
"Bookshelf Symbol 7"
"Braggadocio"
"Britannic Bold"
"Calibri"
"Calibri Bold"
"Calibri Italic"
"Calibri Light"
"Calibri Light Italic"
"Calibri Bold Italic"
"Calisto MT Bold"
"Calisto MT Italic"
"Calisto MT"
"Calisto MT Bold Italic"
"Cambria"
"Cambria Math"
"Cambria Bold"
"Cambria Italic"
"Cambria Bold Italic"
"Candara"
"Candara Bold"
"Candara Italic"
"Candara Bold Italic"
"Century Gothic Bold Italic"
"Century Gothic Bold"
"Century Gothic Italic"
"Century Gothic"
"Century Schoolbook Bold Italic"
"Century Schoolbook Bold"
"Century Schoolbook Italic"
"Century Schoolbook"
"Century"
"Colonna MT"
"Comic Sans MS Bold"
"Consolas"
"Consolas Bold"
"Consolas Italic"
"Consolas Bold Italic"
"Constantia"
"Constantia Bold"
"Constantia Italic"
"Constantia Bold Italic"
"Cooper Black"
"Copperplate Gothic Bold"
"Corbel"
"Corbel Bold"
"Corbel Italic"
"Corbel Bold Italic"
"Cordia New"
"Cordia New Bold"
"Cordia New Bold Italic"
"Cordia New Italic"
"CordiaUPC"
"CordiaUPC Bold"
"CordiaUPC Bold Italic"
"CordiaUPC Italic"
"Curlz MT"
"David"
"David Bold"
"DengXian Regular"
"DengXian Bold"
"DengXian Light"
"Desdemona"
"Dubai Bold"
"Dubai Light"
"Dubai Medium"
"Dubai Regular"
"Edwardian Script ITC"
"Engravers MT"
"Engravers MT Bold"
"Eurostile Bold"
"Eurostile"
"FangSong"
"Footlight MT Light"
"Franklin Gothic Book Italic"
"Franklin Gothic Book"
"Franklin Gothic Demi Cond"
"Franklin Gothic Demi Italic"
"Franklin Gothic Demi"
"Franklin Gothic Heavy Italic"
"Franklin Gothic Heavy"
"Franklin Gothic Medium Cond"
"Franklin Gothic Medium Italic"
"Franklin Gothic Medium"
"Gabriola"
"Garamond"
"Garamond Bold"
"Garamond Bold Italic"
"Garamond Italic"
"Gautami"
"Gautami Bold"
"Gill Sans MT Bold Italic"
"Gill Sans MT Bold"
"Gill Sans MT Condensed"
"Gill Sans MT Ext Condensed Bold"
"Gill Sans MT Italic"
"Gill Sans MT"
"Gill Sans Ultra Bold"
"Gloucester MT Extra Condensed"
"Goudy Old Style Bold"
"Goudy Old Style Italic"
"Goudy Old Style"
"Gulim"
"GulimChe"
"Dotum"
"DotumChe"
"Haettenschweiler"
"Harrington"
"HGGothicE"
"HGPGothicE"
"HGSGothicE"
"HGMinchoE"
"HGPMinchoE"
"HGSMinchoE"
"HGSoeiKakugothicUB"
"HGPSoeiKakugothicUB"
"HGSSoeiKakugothicUB"
"HGMaruGothicMPRO"
"Microsoft Himalaya"
"Imprint MT Shadow"
"KaiTi"
"Kartika"
"Kartika Bold"
"Kino MT"
"Latha"
"Latha Bold"
"Lucida Console"
"Lucida Sans Demibold Italic"
"Lucida Sans Demibold Roman"
"Lucida Sans Italic"
"Lucida Sans Unicode"
"Lucida Sans Regular"
"Lucida Blackletter"
"Lucida Bright"
"Lucida Bright Demibold"
"Lucida Bright Demibold Italic"
"Lucida Bright Italic"
"Lucida Calligraphy Italic"
"Lucida Fax Demibold"
"Lucida Fax Demibold Italic"
"Lucida Fax Italic"
"Lucida Fax Regular"
"Lucida Handwriting Italic"
"Lucida Sans Typewriter Bold"
"Lucida Sans Typewriter Bold Oblique"
"Lucida Sans Typewriter Oblique"
"Lucida Sans Typewriter Regular"
"Malgun Gothic"
"Malgun Gothic Bold"
"Malgun Gothic Semilight"
"Mangal"
"Mangal Bold"
"Marlett"
"Matura MT Script Capitals"
"Meiryo"
"Meiryo Italic"
"Meiryo UI"
"Meiryo UI Italic"
"Meiryo Bold"
"Meiryo Bold Italic"
"Meiryo UI Bold"
"Meiryo UI Bold Italic"
"MingLiU"
"PMingLiU"
"MingLiU_HKSCS"
"MingLiU-ExtB"
"PMingLiU-ExtB"
"MingLiU_HKSCS-ExtB"
"Mistral"
"Myanmar Text"
"Myanmar Text Bold"
"Modern No. 20"
"Mongolian Baiti"
"Monotype Corsiva"
"Monotype Sorts"
"MS Reference Sans Serif"
"MS Reference Specialty"
"MS Gothic"
"MS UI Gothic"
"MS PGothic"
"Microsoft JhengHei"
"Microsoft JhengHei Bold"
"MS Mincho"
"MS PMincho"
"Microsoft YaHei"
"Microsoft YaHei Bold"
"Microsoft YaHei Light"
"Microsoft Yi Baiti"
"MT Extra"
"News Gothic MT Bold Italic"
"News Gothic MT Bold"
"News Gothic MT Italic"
"News Gothic MT"
"Microsoft New Tai Lue"
"Microsoft New Tai Lue Bold"
"Nyala"
"Onyx"
"Palatino Linotype"
"Palatino Linotype Bold"
"Palatino Linotype Bold Italic"
"Palatino Linotype Italic"
"Perpetua Bold Italic"
"Perpetua Bold"
"Perpetua Italic"
"Perpetua Titling MT Bold"
"Perpetua Titling MT Light"
"Perpetua"
"Rockwell Bold Italic"
"Rockwell Bold"
"Rockwell Condensed Bold"
"Rockwell Condensed"
"Rockwell Extra Bold"
"Rockwell Italic"
"Rockwell"
"Segoe Print Bold"
"Segoe Script Bold"
"Segoe UI Historic"
"Segoe UI Symbol"
"SimHei"
"SimSun"
"SimSun-ExtB"
"Stencil"
"STHupo"
"STLiti"
"STXingkai"
"STXinwei"
"STZhongsong"
"Symbol"
"Tahoma"
"Tahoma Bold"
"Microsoft Tai Le"
"Microsoft Tai Le Bold"
"TH SarabunPSK Bold Italic"
"TH SarabunPSK Bold"
"TH SarabunPSK Italic"
"TH SarabunPSK"
"Times New Roman"
"Times New Roman Bold"
"Times New Roman Bold Italic"
"Times New Roman Italic"
"Trebuchet MS Bold Italic"
"Tunga"
"Tunga Bold"
"Tw Cen MT Bold Italic"
"Tw Cen MT Bold"
"Tw Cen MT Condensed Bold"
"Tw Cen MT Condensed Extra Bold"
"Tw Cen MT Condensed"
"Tw Cen MT Italic"
"Tw Cen MT"
"Verdana Bold Italic"
"Verdana Bold"
"Verdana Italic"
"Verdana"
"Webdings"
"Wide Latin"
"Wingdings 2"
"Wingdings 3"
"Wingdings"
"Yu Gothic Bold"
"Yu Gothic UI Bold"
"Yu Gothic UI Semibold"
"Yu Gothic Light"
"Yu Gothic UI Light"
"Yu Gothic Medium"
"Yu Gothic UI Regular"
"Yu Gothic Regular"
"Yu Gothic UI Semilight"
"Yu Mincho Regular"
"Yu Mincho Demibold"
"Yu Mincho Light"
"Arial"
"Arial Bold"
"Arial Bold Italic"
"Arial Italic"
"Comic Sans MS Bold"
"Times New Roman"
"Times New Roman Bold"
"Times New Roman Bold Italic"
"Times New Roman Italic"
"Verdana Bold Italic"
"Verdana Bold"
"Verdana Italic"
"Verdana"
"Webdings"

@ondrs Unfortunately, I still cannot reproduce the problem on my side. I have used the following code for testing:

import com.aspose.words.*;

Document doc = new Document("/temp/in.docx");
// Report font substitutions that happen while the document is processed.
doc.setWarningCallback(new FontSubstitutionWarningCollector());
doc.save("/temp/out.pdf");

private static class FontSubstitutionWarningCollector implements IWarningCallback {

    @Override
    public void warning(WarningInfo info) {
        // Print only font substitution warnings and ignore all other types.
        if (info.getWarningType() == WarningType.FONT_SUBSTITUTION) {
            System.out.println(info.getDescription());
        }
    }
}

I have used the latest 23.6 version of Aspose.Words for Java for testing. Could you please try implementing IWarningCallback as in my example and let us know what warnings you see in the output? Also, please try using the latest version on your side.

2023-07-04 11:42:48,275 [nREPL-session-1bbea322-f963-48d6-910c-aced84c6fe74] WARN  [de.doc-converter.convert.aspose] - Import of element 'shapedefaults' is not supported in Docx format by Aspose.Words.
2023-07-04 11:42:48,792 [nREPL-session-1bbea322-f963-48d6-910c-aced84c6fe74] WARN  [de.doc-converter.convert.aspose] - Import of element 'extraClrSchemeLst' is not supported in Docx format by Aspose.Words.
2023-07-04 11:42:51,043 [nREPL-session-1bbea322-f963-48d6-910c-aced84c6fe74] WARN  [de.doc-converter.convert.aspose] - Font 'Calibri' has not been found. Using 'Liberation Sans' font instead. Reason: table substitution.
2023-07-04 11:42:51,314 [nREPL-session-1bbea322-f963-48d6-910c-aced84c6fe74] WARN  [de.doc-converter.convert.aspose] - Font 'Calibri Light' has not been found. Using 'DejaVu Sans' font instead. Reason: font info substitution.
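
To collect the missing fonts from many such log lines, the substitution warnings can be parsed with a regular expression. The following is only a sketch: the class name FontWarningParser is hypothetical, and the pattern assumes the exact message wording shown in the two "has not been found" lines above, which may differ in other Aspose.Words versions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FontWarningParser {

    // Assumed message shape, copied from the log lines above.
    private static final Pattern SUBSTITUTION = Pattern.compile(
            "Font '([^']+)' has not been found\\. Using '([^']+)' font instead\\.");

    /** Returns missing font -> substitute font, in order of first appearance. */
    public static Map<String, String> parse(List<String> lines) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = SUBSTITUTION.matcher(line);
            if (m.find()) {
                // Keep the first substitute reported for each missing font.
                result.putIfAbsent(m.group(1), m.group(2));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
                "Import of element 'shapedefaults' is not supported in Docx format by Aspose.Words.",
                "Font 'Calibri' has not been found. Using 'Liberation Sans' font instead. Reason: table substitution.",
                "Font 'Calibri Light' has not been found. Using 'DejaVu Sans' font instead. Reason: font info substitution.");
        parse(log).forEach((missing, used) -> System.out.println(missing + " -> " + used));
    }
}
```

Lines that are not font substitution warnings, such as the two "Import of element" messages, are simply skipped.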

And here is the output from top:

top - 11:46:16 up 10 min,  0 users,  load average: 1.51, 1.22, 0.62
Tasks:   3 total,   1 running,   2 sleeping,   0 stopped,   0 zombie
%Cpu(s): 16.8 us,  0.1 sy,  0.0 ni, 83.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 11234332 total,  4118068 free,  2186700 used,  4929564 buff/cache
KiB Swap:  1048572 total,  1045220 free,     3352 used.  8380688 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                     
    1 de        20   0 5920644   1.3g  31436 S  99.7 12.3   4:53.01 java                                                        
  116 de        20   0   11844   2772   2536 S   0.0  0.0   0:00.00 sh                                                          
  122 de        20   0   56216   3732   3204 R   0.0  0.0   0:00.00 top
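
top only shows that the java process as a whole is busy. To see which thread inside the JVM is spinning, and where, one option is to sample per-thread CPU time with the standard java.lang.management API. This is a rough sketch, assuming it runs inside the same JVM as the conversion. The class name BusyThreadFinder and the method findBusiest are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

public class BusyThreadFinder {

    /**
     * Samples per-thread CPU time twice, sampleMillis apart, and returns the
     * ThreadInfo (including the stack trace) of the thread that used the most
     * CPU in between, or null if nothing could be measured.
     */
    public static ThreadInfo findBusiest(long sampleMillis) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return null;
        }
        Map<Long, Long> before = new HashMap<>();
        for (long id : threads.getAllThreadIds()) {
            before.put(id, threads.getThreadCpuTime(id));
        }
        Thread.sleep(sampleMillis);

        long busiestId = -1;
        long busiestDelta = 0;
        for (long id : threads.getAllThreadIds()) {
            long now = threads.getThreadCpuTime(id); // -1 if the thread has died
            Long then = before.get(id);
            if (now < 0 || then == null || then < 0) {
                continue; // thread started or ended during the sample window
            }
            if (now - then > busiestDelta) {
                busiestDelta = now - then;
                busiestId = id;
            }
        }
        // Integer.MAX_VALUE requests the full stack trace of that thread.
        return busiestId < 0 ? null : threads.getThreadInfo(busiestId, Integer.MAX_VALUE);
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadInfo info = findBusiest(1000);
        System.out.println(info == null ? "no busy thread found" : info.toString());
    }
}
```

Called from a watchdog thread while a conversion is stuck, the returned stack trace should point at the code that is looping. A thread dump taken with jstack gives similar information from outside the process.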

@ondrs Thank you for additional information. Could you please also attach the following fonts from your environment, where the problem is reproducible:

  • ‘Liberation Sans’
  • ‘DejaVu Sans’

Also, have you tried running the same conversion using the latest 23.6 version of Aspose.Words for Java? If your license does not allow you to update to the latest version, you can request a free 30-day temporary license for testing.

Archive.zip (3.6 MB)
Fonts attached.

I haven’t tried the latest version yet. It would take some time for us to update.
If possible, I would like to first try to identify and verify the problem with version 22.4.
Is it possible for you to replicate it?
Have you noticed a similar bug before?

Thank you in advance

@ondrs The problem is reproducible with version 22.4. I have checked other versions, and the problem does not occur starting from version 22.7. According to the release notes, several issues with hanging while building the document layout were fixed. So I would suggest updating to a newer version of Aspose.Words.

Verified - the problem is indeed fixed in newer versions.

Thank you for your help!
